Abstract

A new class of random composition structures (the ordered analog of Kingman's partition structures) 
is defined by a regenerative description of component sizes. Each regenerative composition structure is 
represented by a process of random sampling of points from an exponential distribution on the positive 
halfline, and separating the points into clusters by an independent regenerative random set. Examples are 
composition structures derived from residual allocation models, including one associated with the Ewens 
sampling formula, and composition structures derived from the zero set of a Brownian motion or Bessel 
process. We provide characterisation results and formulas relating the distribution of the regenerative 
composition to the Lévy parameters of a subordinator whose range is the corresponding regenerative
set. In particular, the only reversible regenerative composition structures are those associated with the
interval partition of [0, 1] generated by excursions of a standard Bessel bridge of dimension $2 - 2\alpha$ for
some $\alpha \in [0, 1]$.
1 Introduction



A composition of a positive integer $n$ is a sequence of positive integers $\lambda = (n_1, \ldots, n_k)$ with sum $\sum_j n_j = n$.
Each $n_j$ may be called a part of the composition. We will use the notation $\lambda \models n$ to say that $\lambda$ is a
composition of $n$. A random composition of $n$ is a random variable $C_n$ with values in the set of all $2^{n-1}$
compositions of $n$. A composition structure $(C_n)$ is a Markovian sequence of random compositions of $n$,
for $n = 1, 2, \ldots$, whose cotransition probabilities are determined by the following property of sampling
consistency [10, 15]: if $n$ identical balls are distributed into an ordered series of boxes according to $C_n$,
then $C_{n-1}$ is obtained by discarding one of the balls picked uniformly at random, and then deleting an
empty box in case one is created. We study composition structures with the following further property:

Definition 1.1 A composition structure $(C_n)$ is regenerative if for all $n > m \ge 1$, given that the first
part of $C_n$ is $m$, the remaining composition of $n - m$ is distributed like $C_{n-m}$.

According to our main result (Theorem 5.2), each regenerative composition structure can be represented
by a process of random sampling of points from the exponential distribution on $[0, \infty[$, and separating the
sample points into clusters by points of an independent regenerative random closed subset $\mathcal{R}$ of $[0, \infty[$. We
recall in Theorem 5.1 the fundamental result of Maisonneuve that every such $\mathcal{R}$ can be represented
as the closed range of a subordinator $(S_t)$, that is an increasing process with stationary independent
increments. Each possible distribution of a regenerative composition structure is thereby described in
terms of the drift coefficient $d$ and Lévy measure $\nu$ of an associated subordinator. Alternatively, we
can transform $\mathcal{R}$ into $\tilde{\mathcal{R}} := 1 - \exp(-\mathcal{R}) \subset [0, 1]$ and replace the exponential sample by a sample from
the uniform distribution on [0, 1]. In this form the construction is an instance of the ordered paintbox
representation of composition structures, developed in [10, 15].

Keeping track of only the sizes of parts, and not their order, every composition structure induces
a partition structure, that is a sequence of sampling consistent partitions of integers, as studied by
Kingman [24, 25]. Passing from compositions to partitions is equivalent to passing from the ordered
paintbox $\tilde{\mathcal{R}}^c = [0, 1] \setminus \tilde{\mathcal{R}}$ to Kingman's paintbox defined by the decreasing sequence of lengths of interval
components of $\tilde{\mathcal{R}}^c$. A partition structure is thereby associated with a typically infinite collection of
composition structures, each corresponding to a different way of ordering interval components of given
lengths. We show that if one of these composition structures is regenerative, it is unique in distribution
(Corollary 7.3). In a later section we also discuss necessary and sufficient conditions for the existence of such
a regenerative rearrangement.

Known examples of regenerative composition structures include the compositions associated with the
ordered Ewens sampling formula [10], and those derived from the zero set of a recurrent Bessel process.
The partition structures corresponding to these examples are instances of the two-parameter
family of partition structures studied in [28]. We show in a later section that each member of this family,
with positive values of parameters, corresponds to a unique regenerative composition structure. Also
(Theorem 10.1), the only reversible regenerative composition structures are the members of this family
associated with the interval partition of [0, 1] generated by excursions of a standard Bessel bridge of
dimension $2 - 2\alpha$ for some $\alpha \in [0, 1]$. See also the later sections for further examples of regenerative
composition structures.

Our definition of regenerative composition structures is reminiscent of Kingman's characterisation of
the one-parameter Ewens partition structure by invariance with respect to deletion of a random part,
selected in a size-biased fashion. This property is called species noninterference or neutrality in the setting
of population genetics. We refer to [2] for background on partition structures, exchangeability
and related matters. As shown by James [21], another closely related concept, developed in the setting
of Bayesian nonparametric statistics, is Doksum's [9] notion of a random discrete probability distribution
that is neutral to the right.

From an algebraic viewpoint, our representation of regenerative composition structures is equivalent
to solving a nonlinear recurrence (Proposition 3.3). The nonlinearity of the recursion reflects the fact
that the family of probability laws of regenerative compositions is not closed under mixtures. So unlike
the problems of characterising all partition or composition structures, the problem of characterising all
regenerative composition structures is not just a problem of identifying the extreme points of a convex
set. Still, we show in Section 5 that it can be reduced to such a problem (equivalent to a version of
the Hausdorff moment problem), by a suitable non-linear transformation. The Lévy data $(d, \nu)$ of the
associated subordinator are thereby encoded in a finite measure on [0, 1].

2 Compositions and partitions 

This section recalls briefly some background material on composition structures and their associated
partition structures. See [28, 29] for a fuller account. For a composition structure $(C_n)$, and a
composition $\lambda = (n_1, \ldots, n_k)$ of $n$, define the composition probability function $p$ by

$$ p(\lambda) := \mathbb{P}(C_n = \lambda). \tag{1} $$

For each fixed $n$, this function defines a probability distribution on the set of compositions $\lambda \models n$, and
these distributions are subject to the following linear relation describing the sampling consistency. For
$\lambda = (n_1, \ldots, n_k) \models n$ and $\mu \models n + 1$ we say that $\mu$ extends $\lambda$ and write $\mu \searrow \lambda$ if $\mu$ is obtained from $\lambda$ by
either increasing a part $n_j$ by one or by inserting a part 1 in the sequence $\lambda$. The sampling consistency
amounts to the recursion

$$ p(\lambda) = \sum_{\mu \searrow \lambda} \kappa(\lambda, \mu)\, p(\mu), \qquad p(1) = 1 \tag{2} $$

where the coefficient $\kappa(\lambda, \mu)$ equals $(n_j + 1)/(n + 1)$ if $\mu$ is obtained by increasing a part $n_j$, and equals
$(j + 1)/(n + 1)$ if $\mu$ is obtained by inserting a 1 into a row of consecutive ones $1, 1, \ldots, 1$ of length $j \ge 0$.
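The coefficients $\kappa(\lambda, \mu)$ can be read off from the ball-deletion description of sampling consistency. The following is a minimal Python sketch, not part of the original text; the function name `delete_random_ball` is a hypothetical helper introduced only for illustration. It returns the law of the composition obtained from $\mu$ by discarding one uniformly chosen ball and deleting an empty box if one is created.

```python
from collections import defaultdict
from fractions import Fraction


def delete_random_ball(mu):
    """Law of the composition left after removing one uniform ball from mu.

    mu is a tuple of positive integers (a composition of n + 1).
    Returns a dict {lambda: kappa(lambda, mu)} over compositions of n.
    """
    total = sum(mu)
    law = defaultdict(Fraction)
    for i, part in enumerate(mu):
        # Removing a ball from box i shrinks that part; an emptied box is deleted.
        shrunk = (part - 1,) if part > 1 else ()
        reduced = mu[:i] + shrunk + mu[i + 1:]
        law[reduced] += Fraction(part, total)
    return dict(law)


# Example: mu = (2, 1, 1) is a composition of 4.
print(delete_random_ball((2, 1, 1)))
```

For $\mu = (2, 1, 1)$ the composition $(2, 1)$ arises by deleting either of the two trailing singletons, which matches the coefficient $(j + 1)/(n + 1) = 2/4$ for inserting a 1 into a row of $j = 1$ ones.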

Regard $C_n$ as a way to partition a row of $n$ identical balls into an ordered series of non-empty boxes, and
independently of $C_n$ let the balls be labelled by a uniform random permutation of the set $[n] := \{1, \ldots, n\}$.
This defines a random exchangeable ordered partition $C_n^*$ of the set $[n]$ whose distribution is defined as
follows. For each particular ordered partition of $[n]$ into $k$ classes of sizes $n_1, \ldots, n_k$, say $c^*$,

$$ \mathbb{P}(C_n^* = c^*) = \binom{n}{n_1, \ldots, n_k}^{-1} p(n_1, \ldots, n_k) \tag{3} $$

since the multinomial coefficient is the number of such ordered partitions of $[n]$, and these are equally
likely. The sampling consistency property of a composition structure $(C_n)$ means that $(C_n^*)$ can be
constructed consistently, in the sense that $C_{n-1}^*$ is the restriction of $C_n^*$ obtained by deleting element $n$.
Then $C_n$ is the ordered record of sizes of classes of $C_n^*$, and the entire sequence $(C_n^*)$ defines an exchangeable
ordered partition of the set $\mathbb{N}$ of all positive integers.

Ignoring the order of classes yields a random exchangeable partition $\Pi$ of the set $\mathbb{N}$. The restriction
$\Pi_n$ of $\Pi$ to $[n]$ is obtained by ignoring the order of classes of $C_n^*$. So for each particular partition $\pi$ of $[n]$
into $k$ classes whose sizes in some order are $n_1, \ldots, n_k$,

$$ \mathbb{P}(\Pi_n = \pi) = \binom{n}{n_1, \ldots, n_k}^{-1} \sum_{\sigma} p(n_{\sigma(1)}, \ldots, n_{\sigma(k)}) \tag{4} $$

where the sum is over the $k!$ permutations $\sigma$ of $[k]$, corresponding to the $k!$ different ordered partitions $c^*$ of
$[n]$ derived from the given partition $\pi$ of $[n]$. This symmetric function of $(n_1, \ldots, n_k)$ is the exchangeable
partition probability function (EPPF). Note by construction that the partition of $n$ defined by
the decreasing rearrangement of sizes of classes of $\Pi_n$, or of $C_n^*$, is identical to the decreasing rearrangement
of the parts of $C_n$. Such a sequence of random partitions of $n$, subject to a consistency constraint, is
called a partition structure.

3 Regenerative composition structures 

Let $(C_n)$ be a composition structure with composition probability function $p$. Let $F_n$ denote the size of
the first part of $C_n$, and denote the distribution of $F_n$ by

$$ q(n : m) := \mathbb{P}(F_n = m) = \sum_{(n_1, \ldots, n_k)} 1(n_1 = m)\, p(n_1, \ldots, n_k), \qquad 1 \le m \le n, \tag{5} $$

where the sum is over all compositions $(n_1, \ldots, n_k)$ of $n$, and $1(\cdots)$ denotes the indicator function which
equals 1 if $\cdots$ and 0 else. We call $q$ the decrement matrix of the composition structure $(C_n)$.

Proposition 3.1 A composition structure $(C_n)$ is regenerative in the sense of Definition 1.1 iff for
each $n = 1, 2, \ldots$ the distribution of $C_n$ is determined by the product formula

$$ p(n_1, \ldots, n_k) = \prod_{j=1}^{k} q(N_j : n_j) \tag{6} $$

for each composition $(n_1, \ldots, n_k)$ of $n$, where $N_j := n_j + \cdots + n_k$ and $q(n : m)$ is the decrement
matrix defined by (5). Thus the law of a regenerative composition structure is uniquely determined by its
decrement matrix.

Proof. This is easily shown by induction on the number of parts of a composition. □ 
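As an illustration of the product formula (6), here is a minimal Python sketch (not from the original text). The helper names `compositions`, `product_formula` and `q_uniform` are hypothetical. The decrement matrix $q(n : m) = 1/n$ is borrowed from the uniform stick-breaking case of Example 2 in Section 4; any other decrement matrix with rows summing to one could be substituted.

```python
from fractions import Fraction


def compositions(n):
    """Yield every composition of n as a tuple of positive integers."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest


def product_formula(comp, q):
    """p(n_1, ..., n_k) = prod_j q(N_j : n_j), with N_j = n_j + ... + n_k."""
    remaining = sum(comp)
    prob = Fraction(1)
    for part in comp:
        prob *= q(remaining, part)
        remaining -= part
    return prob


def q_uniform(n, m):
    """Decrement matrix q(n : m) = 1/n (uniform stick-breaking, Example 2)."""
    return Fraction(1, n)


# The probabilities of all 2^(n-1) compositions of n should sum to one.
n = 5
print(sum(product_formula(c, q_uniform) for c in compositions(n)))
```

Exact rational arithmetic is used so that the normalisation can be checked without rounding error.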

Note that if $q(2 : 1) = 1$ then $q(n : m) = 1(m = 1)$, meaning that each $C_n$ is a pure singleton
composition, with $p(1, 1, \ldots, 1) = 1$. Whereas if $q(2 : 2) = 1$ then $q(n : m) = 1(m = n)$, meaning that
each $C_n$ is a trivial one-part composition with $p(n) = 1$. These facts are easy to check using (2) and
$p \ge 0$, and they are intuitively obvious: $q(2 : 1) = 1$ (respectively $q(2 : 1) = 0$) means that two randomly
sampled balls never come from the same box (respectively from different boxes). It can be shown that
$q(4 : 2) > 0$ implies $0 < q(n : m) < 1$ for all $1 \le m \le n$ and $n > 1$ and therefore $0 < p(\lambda) < 1$ for
$\lambda \models n > 1$. In the case $q(4 : 2) = 0$ and $0 < q(2 : 2) < 1$ we have $q(n : 1) + q(n : n) = 1$ for all $n$, hence
$p(\lambda) > 0$ only for compositions of the form $\lambda = (n)$ or $\lambda = (1, 1, \ldots, 1, k)$ with $k \ge 1$.

The product formula (6) identifies $C_n$ with the sequence of decrements of a transient Markov chain
$Q_n = (Q_n(0), Q_n(1), \ldots)$ with values in $\{0, \ldots, n\}$. This chain has decreasing paths starting from the state
$Q_n(0) = n$, with the terminal state 0 and a time-homogeneous triangular transition matrix: the chain
jumps from $n$ to $n - m$ with probability $q(n : m)$, $1 \le m \le n < \infty$. In this interpretation the parts of a
composition $n_1, \ldots, n_k$ are the magnitudes of jumps of the chain, while $(N_1, \ldots, N_k)$ is the path of $Q_n$ prior
to absorption. For example, if $C_8 = (3, 2, 1, 2)$, the path of $Q_8$ is

$$ (Q_8(0), Q_8(1), \ldots) = (8, 5, 3, 2, 0, 0, \ldots). $$
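The Markov chain description translates directly into a sampler. The sketch below is illustrative only and rests on assumptions: `chain_path`, `sample_composition` and `q_uniform` are hypothetical names, and the decrement matrix $q(n : m) = 1/n$ is just one convenient choice. It recovers the path of the chain from a composition and, conversely, draws a composition by running the chain from $n$ down to 0.

```python
import random


def chain_path(comp):
    """Path (N_1, ..., N_k, 0) of the decreasing chain encoded by a composition."""
    state = sum(comp)
    path = [state]
    for part in comp:
        state -= part
        path.append(state)
    return path


def sample_composition(n, q, rng):
    """Run the chain from n to 0; the successive jump sizes are the parts."""
    parts = []
    state = n
    while state > 0:
        sizes = range(1, state + 1)
        weights = [q(state, m) for m in sizes]
        jump = rng.choices(sizes, weights=weights)[0]
        parts.append(jump)
        state -= jump
    return tuple(parts)


def q_uniform(n, m):
    """Illustrative decrement matrix q(n : m) = 1/n."""
    return 1.0 / n


print(chain_path((3, 2, 1, 2)))
print(sample_composition(8, q_uniform, random.Random(0)))
```

The sampled tuple varies with the seed; only the first printed line is deterministic, and it reproduces the path displayed above up to the first visit to 0.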

Consider now the joint law of two compositions derived from a regenerative composition $C_n$ by a
random splitting, say $C_n = (C_n^<, C_n^>)$, where $C_n^<$ is a composition of $m(C_n^<) \in \{1, \ldots, n\}$, and $C_n^>$ is the
remaining composition of $n - m(C_n^<)$, regarded as a trivial sequence with no elements if $m(C_n^<) = n$.
Suppose that the number of parts of $C_n^<$ is a randomised stopping time of the chain $Q_n$, meaning [33]
that for each $1 \le k \le n$, given $C_n$ with at least $k$ parts, the conditional probability that $C_n^<$ has exactly
$k$ parts depends only on the first $k$ parts of $C_n$. Equivalently, for each $\lambda = (n_1, \ldots, n_\ell) \models n$ and each
$\lambda^< = (n_1, \ldots, n_k)$ for some $1 \le k \le \ell$,

$$ \mathbb{P}(C_n^< = \lambda^< \mid C_n = \lambda) = f_n(\lambda^<) \tag{7} $$

for some function $f_n$ of compositions of $m$ for $1 \le m \le n$. The strong Markov property of $Q_n$ then
implies that

(i) the compositions $C_n^<$ and $C_n^>$ are conditionally independent given $m(C_n^<)$, and

(ii) for each $1 \le m < n$, given $m(C_n^<) = m$ the remaining composition of $n - m$ is distributed like $C_{n-m}$.

Conversely, we record the following proposition which applies in particular to the splitting scheme
defined by (7) with $f_n(n_1, \ldots, n_k) = n_k/n$. In terms of balls in boxes, such a split is made just to the
right of the box containing a ball picked uniformly at random.

Proposition 3.2 Suppose a composition structure $(C_n)$ admits a random splitting $C_n = (C_n^<, C_n^>)$ for
each $n$, such that (7) holds with $f_n(m) > 0$ for all $1 \le m \le n$, and (ii) holds. Then $(C_n)$ is regenerative.

Proof. Let $p$ denote the composition probability function of $(C_n)$, as in (1). By definition, $(C_n)$ is
regenerative iff for all $1 \le m < n$ and all compositions $\lambda^>$ of $n - m$

$$ p(m, \lambda^>) = q(n : m)\, p(\lambda^>) \tag{8} $$

for some matrix $q(n : m)$, which is then the decrement matrix of $(C_n)$. Whereas (ii) holds iff for all
$1 \le m < n$ and all compositions $\lambda^>$ of $n - m$

$$ \sum_{\lambda^< \models m} f_n(\lambda^<)\, p(\lambda^<, \lambda^>) = \tilde{q}(n : m)\, p(\lambda^>) \tag{9} $$

for some matrix $\tilde{q}(n : m)$, in which case $\tilde{q}(n : m) = \mathbb{P}(m(C_n^<) = m)$. Assuming that (9) holds, (8) is
obvious for $m = 1$ with $q(n : 1) = \tilde{q}(n : 1)/f_n(1)$. Proceeding by induction on $m$, suppose that (9) holds
for all $1 \le m < n$, and that (8) has been established with $m'$ instead of $m$ for all $1 \le m' < m < n$. Apart
from the term $f_n(m)\, p(m, \lambda^>)$, all terms of the sum in (9) involve compositions $\lambda^<$ all of whose parts
are smaller than $m$. So the inductive hypothesis allows us to write these terms as $f_n(\lambda^<)\, h_n(\lambda^<)\, p(\lambda^>)$
where $h_n(\lambda^<)$ is a product of entries of the decrement matrix $q$. Now rearrange (9) to isolate the term
$f_n(m)\, p(m, \lambda^>)$ on the left, and observe that $p(\lambda^>)$ is a common factor on the right, to complete the
induction. □

Our aim now is to describe as explicitly as possible all matrices $q$ which define a composition structure
by means of (6). We start with an algebraic description:

Proposition 3.3 A non-negative matrix $q$ is the decrement matrix of some regenerative composition
structure iff $q(1 : 1) = 1$ and

$$ q(n : m) = \frac{m + 1}{n + 1}\, q(n + 1 : m + 1) + \frac{n + 1 - m}{n + 1}\, q(n + 1 : m) + \frac{1}{n + 1}\, q(n + 1 : 1)\, q(n : m) \tag{10} $$

for $1 \le m \le n$.
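The recursion (10) is easy to test numerically on a candidate matrix. The following Python sketch is an illustration under stated assumptions, not part of the original argument: `recursion_residual`, `q_uniform` and `q_geometric` are hypothetical names, and the two matrices are those of the uniform stick-breaking and geometric sampling examples of Section 4, with the arbitrary choice $x = 1/3$.

```python
from fractions import Fraction
from math import comb


def q_uniform(n, m):
    """q(n : m) = 1/n, the uniform stick-breaking case."""
    return Fraction(1, n)


def q_geometric(n, m, x=Fraction(1, 3)):
    """Binomial(n, x) distribution conditioned on a positive value."""
    return comb(n, m) * x**m * (1 - x) ** (n - m) / (1 - (1 - x) ** n)


def recursion_residual(q, n, m):
    """Left side of (10) minus its right side; zero when (10) holds."""
    rhs = (
        Fraction(m + 1, n + 1) * q(n + 1, m + 1)
        + Fraction(n + 1 - m, n + 1) * q(n + 1, m)
        + Fraction(1, n + 1) * q(n + 1, 1) * q(n, m)
    )
    return q(n, m) - rhs


for q in (q_uniform, q_geometric):
    ok = all(
        recursion_residual(q, n, m) == 0
        for n in range(1, 8)
        for m in range(1, n + 1)
    )
    print(q.__name__, ok)
```

Rational arithmetic makes the check exact for the tested range of $n$; it is of course a finite check, not a proof.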

Proof. We will show first that the condition (10) is sufficient, that is (10) and (6) imply (2). Indeed,
assuming (10) and (6),

$$ q(n : n) = q(n + 1 : n + 1) + \frac{1}{n + 1}\, q(n + 1 : n) + \frac{1}{n + 1}\, q(n + 1 : 1)\, q(n : n) $$

implies readily

$$ p(n) = p(n + 1) + \frac{1}{n + 1}\, p(n, 1) + \frac{1}{n + 1}\, p(1, n) $$

which means (2) for all one-part compositions. Now suppose (2) holds for all compositions with less than
$k$ parts, and let $\lambda \models n$ be a composition with $k$ parts. Write $\lambda$ in the form $\lambda = (m, \lambda')$ where $\lambda' \models n - m$.
We have by the induction hypothesis and (6)

$$ \sum_{\mu \searrow \lambda} \kappa(\lambda, \mu)\, p(\mu) = \frac{1}{n + 1}\, p(1, \lambda) + \frac{m + 1}{n + 1}\, p(m + 1, \lambda') + \frac{n - m + 1}{n + 1} \sum_{\mu' \searrow \lambda'} \kappa(\lambda', \mu')\, p(m, \mu') $$

$$ = \frac{1}{n + 1}\, q(n + 1 : 1)\, q(n : m)\, p(\lambda') + \frac{m + 1}{n + 1}\, q(n + 1 : m + 1)\, p(\lambda') + \frac{n - m + 1}{n + 1}\, q(n + 1 : m)\, p(\lambda') $$

which by (10) and (6) is equal to $q(n : m)\, p(\lambda') = p(\lambda)$ and the induction step is completed.

Conversely, assuming (2) and (6) the recursion (10) follows by a similar argument with $k = 2$. □

4 First examples 

Example 1 (Geometric sampling [7, 22]). Imagine infinitely many players labeled $1, 2, \ldots$ who flip
repeatedly the same coin with fixed probability $x \in\, ]0, 1]$ for tails. In the first round, each of the players
tosses the coin and those who flip tails drop out. In the second round each of the remaining players
must toss again and those who flip tails drop out, and so on. If we restrict consideration to players
labeled $1, \ldots, n$, a composition $C_n$ arises by arranging the players into groups as they drop out. These
compositions are sampling consistent by exchangeability among the players and they form a regenerative
composition structure because 'all rounds are the same'. Equivalently, we could attribute to each player
$j$ an individual value $\xi_j$, the number of rounds the player remains in the game, and tie the players into
blocks by equality of their individual values. The $\xi_j$ are independent with the same geometric distribution.
The probability that of $n$ players exactly $m$ tie for the minimum value $\min(\xi_1, \ldots, \xi_n)$ is equal to

$$ q(n : m) = \frac{\binom{n}{m} x^m (1 - x)^{n - m}}{1 - (1 - x)^n}, \qquad m = 1, \ldots, n $$

which is the binomial distribution conditioned on a positive value. Note that the one-part or the pure
singleton compositions appear for $x = 1$ or $x \downarrow 0$, respectively.
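The coin-tossing rounds of Example 1 can be simulated directly. This is a minimal sketch, assuming a pseudo-random generator in place of the coin; the function name `geometric_sampling_composition` is hypothetical. Each round, every remaining player drops out independently with probability $x$, and the sizes of the non-empty drop-out groups, in order, form the composition.

```python
import random


def geometric_sampling_composition(n, x, rng):
    """Simulate the rounds of Example 1 for players 1..n with tail probability x."""
    remaining = n
    parts = []
    while remaining > 0:
        # Each remaining player flips tails (and drops out) with probability x.
        dropped = sum(rng.random() < x for _ in range(remaining))
        if dropped > 0:  # rounds in which nobody drops out leave no trace
            parts.append(dropped)
            remaining -= dropped
    return tuple(parts)


rng = random.Random(2024)
print(geometric_sampling_composition(10, 0.3, rng))
print(geometric_sampling_composition(10, 1.0, rng))
```

With $x = 1$ every player drops out in the first round, so the second call always returns the one-part composition $(10)$, in line with the remark closing the example. The first output depends on the seed.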

It is the memoryless property which makes the geometric distribution work, and sampling from any 
other fixed distribution on integers would not produce a regenerative composition. Still, it is possible to 
preserve the regenerative feature by randomising the distribution in a very special way. 

Example 2 (Stick-breaking compositions). Let $(X_k)$ be independent copies of
some random variable $X$ with $0 < X \le 1$. Think of $X_k$ as the probability of tails for the $k$th coin.
Modify the algorithm in the previous example by requiring that at round $k$ each of the remaining players
must toss the $k$th coin. It is easily seen that the resulting composition structure is regenerative. Fixing a
group of $n$ players and conditioning on the number of players that drop out at the first coin-tossing trial
we obtain the recurrence

$$ q(n : m) = \binom{n}{m} \mathbb{E}\left(X^m (1 - X)^{n - m}\right) + \mathbb{E}(1 - X)^n\, q(n : m) $$

resulting in the decrement matrix

$$ q(n : m) = \frac{\binom{n}{m} \mathbb{E}\left(X^m (1 - X)^{n - m}\right)}{\mathbb{E}\left(1 - (1 - X)^n\right)}, \qquad m = 1, \ldots, n \tag{11} $$

which says that $q(n : \cdot)$ is a mixture of binomial distributions conditioned on a positive value.

For example, if $X$ is uniform on [0, 1], then $q(n : m) = n^{-1}$, that is a discrete uniform distribution for
each $n$. More generally, if $X$ has a beta distribution with parameters $(1, \theta)$, $\theta > 0$, the decrement matrix
becomes

$$ q(n : m) = \binom{n}{m} \frac{[\theta]_{n - m}\, m!}{[\theta + 1]_{n - 1}\, n} \tag{12} $$

where

$$ [\theta]_n := \theta(\theta + 1) \cdots (\theta + n - 1) \tag{13} $$

is a rising factorial. The corresponding partition structure is well known to be that defined by the
Ewens sampling formula [11]. The individual values of the players are now only conditionally i.i.d., with
conditional distribution

$$ \mathbb{P}(\xi_j = k \mid X_1, X_2, \ldots) = (1 - X_1) \cdots (1 - X_{k-1}) X_k. $$
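Formula (12) can be evaluated exactly with rational arithmetic. The sketch below is illustrative; `rising` and `q_ewens` are hypothetical helper names. It checks two consequences of the text: each row of the decrement matrix is a probability distribution, and $\theta = 1$ (uniform $X$) gives back $q(n : m) = 1/n$.

```python
from fractions import Fraction
from math import comb, factorial


def rising(theta, n):
    """Rising factorial [theta]_n = theta (theta + 1) ... (theta + n - 1)."""
    result = Fraction(1)
    for i in range(n):
        result *= theta + i
    return result


def q_ewens(n, m, theta):
    """Decrement matrix (12) for X with a beta(1, theta) distribution."""
    numerator = comb(n, m) * rising(theta, n - m) * factorial(m)
    denominator = rising(theta + 1, n - 1) * n
    return numerator / denominator


theta = Fraction(5, 2)  # an arbitrary illustrative parameter value
for n in range(1, 6):
    print(n, [q_ewens(n, m, theta) for m in range(1, n + 1)])
```

Passing `theta` as a `Fraction` keeps every entry exact; a float would also work but then the row sums would only equal one up to rounding.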

Additional randomisation allows the same composition structure to be defined in another way. Mark
the players by independent uniform [0, 1] random variables $(u_j)$, also independent of $(X_k)$. Consider a
random partition of [0, 1] into intervals by points

$$ Y_k = 1 - \prod_{i=1}^{k} (1 - X_i), \qquad k = 1, 2, \ldots. \tag{14} $$

The number of intervals is finite if $\mathbb{P}(X = 1) > 0$ or infinite otherwise. Group together those players
whose individual marks fall in the same component $]Y_{k-1}, Y_k[$, and maintain the order of groups from the
left to the right. This sequential algorithm of random interval division is often referred to as stick-breaking
or as a residual allocation model. Note that in the stick-breaking case the partition of [0, 1] has a first
(leftmost) interval, a second interval, and so on.



Example 3 (Brownian bridge [22]). Consider the partition of [0, 1] by the set of zeros of a Brownian
bridge. This set is perfect, i.e. a compact set with no isolated points. Given a uniform sample $(u_j)$ group
together all sample points which fall into the same excursion interval. This defines a composition structure
which is regenerative, by a self-similarity property of the set of zeros. The decrement matrix is described
later by formula (39) for $\alpha = \theta = 1/2$. Unlike the stick-breaking case there is no leftmost interval.



Example 4 (Brownian motion, meander case [22]). Same as Example 3 but we take the set of zeros
of a Brownian motion on [0, 1]. The collection of intervals is not simply ordered, but there is a definite
last (i.e. rightmost) interval, known as the meander interval, whose right endpoint is 1. The decrement
matrix is described by formula (39) for $\alpha = 1/2$, $\theta = 0$.



Example 5 (Myriads of singletons). Fix $d > 0$ and a distribution of $X$ on $]0, 1]$. Modify the stick-
breaking partition of Example 2 by assuming two types of independent residual allocations. At each
odd step the stick is broken with residual measure beta$(1, d^{-1})$, and at each even step the stick is broken
according to $X$. That is, consider independent random variables $Z_1, X_1, Z_2, X_2, \ldots$ with $Z_i \overset{d}{=}$ beta$(1, d^{-1})$
and $X_i \overset{d}{=} X$, and define $Y_0 = 0$ and

$$ Y_{2k+1} = 1 - (1 - Z_{k+1}) \prod_{j=1}^{k} (1 - Z_j)(1 - X_j), \qquad Y_{2k} = 1 - (1 - Z_k)(1 - X_k) \prod_{j=1}^{k-1} (1 - Z_j)(1 - X_j). $$

Consider a random closed set $\tilde{\mathcal{R}}$ which includes endpoints 0 and 1 and the union of intervals
$[Y_{2k}, Y_{2k+1}]$, $k = 0, 1, \ldots$. If $\mathbb{P}(X = 1) = 0$ the interval partition has infinitely many components.

Draw an independent sample of uniform points $(u_j)$ and define a composition by requiring that the
sample points which hit components $[Y_{2k}, Y_{2k+1}]$ of $\tilde{\mathcal{R}}$ become singletons, while all those which fall in
a particular gap $]Y_{2k+1}, Y_{2k+2}[$ are grouped together. For $n$ large, a typical composition of $n$ will start
with a myriad of singleton parts $1, 1, \ldots, 1$ whose number is of the order of $n$, followed by one part whose
size is of the order of $n$, followed by a myriad, etc.

For $m > 1$ conditioning on the number of sample points out of $n$ which fall into $]Y_1, Y_2[$ leads to a
recursion

$$ q(n : m) = \binom{n}{m} \mathbb{E}\left((1 - Z)^n X^m (1 - X)^{n - m}\right) + \mathbb{E}\left((1 - Z)^n (1 - X)^n\right) q(n : m) $$

which implies $q$ as in (11) but with additional term $nd$ in the denominator.

The total asymptotic frequency of myriads, say $f$, is equal to the Lebesgue measure of $\tilde{\mathcal{R}}$ and satisfies
a distributional equation

$$ f \overset{d}{=} Z_1 + (1 - Z_1)(1 - X_1) f' \tag{15} $$

where $f'$, $Z_1$, $X_1$ are independent and $f' \overset{d}{=} f$. Analysis of this equation shows that the moments of $f$ are
given by a simple formula which we record later in (32).

5 General representation 

Background on subordinators and regenerative sets. Let $d \ge 0$ and $\nu$ be a measure on $]0, \infty]$
satisfying

$$ \int_0^\infty \min(1, z)\, \nu(dz) < \infty. \tag{16} $$

Here and henceforth the integral is over the closed interval $[0, \infty]$. There is no mass at 0 but we allow
the case when $\nu$ gives a positive mass to $z = \infty$. We also require that either $d$ or $\nu$ be nonzero. Consider
a Poisson point process on $[0, \infty[\, \times [0, \infty]$ with intensity measure Lebesgue $\times\, \nu$. Denoting a generic point
of the process $(\tau_j, \Delta_j)$, define the process

$$ S_t = dt + \sum_{\tau_j \le t} \Delta_j, \qquad t \ge 0. \tag{17} $$

The process $(S_t)$ is a subordinator, that is a Lévy process with increasing càdlàg paths, with $S_0 = 0$ and
$S_t \uparrow \infty$. Let $\Phi(\rho)$ be the Laplace exponent of the subordinator, defined for $\rho \ge 0$ by

$$ \mathbb{E}[\exp(-\rho S_t)] = \exp[-t \Phi(\rho)]. $$

Let $\nu(dz)$ be the Lévy measure associated with the subordinator, and let $\tilde{\nu}(dx)$ be the image of $\nu$ via the
transformation $x = 1 - e^{-z}$. According to the Lévy-Khintchine formula,

$$ \Phi(\rho) = \int_0^\infty (1 - e^{-\rho z})\, \nu(dz) + \rho d \tag{18} $$

$$ = \int_0^1 (1 - (1 - x)^\rho)\, \tilde{\nu}(dx) + \rho d \tag{19} $$

$$ = \int_0^1 \rho (1 - x)^{\rho - 1}\, \tilde{\nu}[x, 1]\, dx + \rho d. \tag{20} $$

Let

$$ \mathcal{R} = \{S_t, t \ge 0\}^{\mathrm{cl}} $$

be the closed range of the subordinator. For a random closed subset $\mathcal{R}$ of $[0, \infty]$ let

$$ G(\mathcal{R}, t) := \sup \mathcal{R} \cap [0, t] \quad \text{and} \quad D(\mathcal{R}, t) := \inf \mathcal{R}\, \cap\, ]t, \infty] \tag{21} $$

with the usual conventions $\sup \emptyset = 0$ and $\inf \emptyset = \infty$. Following Maisonneuve [25] and Bertoin [5], call $\mathcal{R}$
regenerative if for each $t \in [0, \infty[$, conditionally on $\{D(\mathcal{R}, t) < \infty\}$, the random set $(\mathcal{R} - D(\mathcal{R}, t)) \cap [0, \infty]$
is distributed like $\mathcal{R}$ and is independent of $[0, D(\mathcal{R}, t)] \cap \mathcal{R}$. The following representation of regenerative
sets is fundamental:

Theorem 5.1 (Maisonneuve [25]) The closed range $\mathcal{R}$ of a subordinator $(S_t)$ is a regenerative random
subset of $[0, \infty]$. Moreover, every regenerative random subset $\mathcal{R}$ of $[0, \infty]$ has the same distribution as
the closed range of some subordinator $(S_t, t \ge 0)$, whose Laplace exponent $\Phi$ is uniquely determined up
to constant multiples.



Standard exponential sampling. Let $(\epsilon_j)$ be a sequence of independent standard exponential
variables, independent of the subordinator $(S_t)$, and let $\epsilon_{1n}, \ldots, \epsilon_{nn}$ be the first $n$ sample points $\epsilon_1, \ldots, \epsilon_n$
arranged in increasing order. Define a partition of the set $\{1, \ldots, n\}$ into blocks of consecutive integers
by letting $j$ and $j + 1$ belong to different blocks iff the closed interval $[\epsilon_{jn}, \epsilon_{j+1,n}]$ contains some point
of $\mathcal{R}$, for $j < n$. Note in particular that $\{j\}$ is a singleton block if $\epsilon_{jn} \in \mathcal{R}$. Define a composition $C_n$ of $n$
by the sequence of counts of block-sizes of this random partition of $\{1, \ldots, n\}$ into blocks of consecutive
integers, from the left to the right. It is obvious by construction that $(C_n)$ is a composition structure;
call it the composition structure derived from the subordinator by standard exponential sampling.

Introduce the binomial moments

$$ \Phi(n : m) = \binom{n}{m} \int_0^\infty (1 - e^{-z})^m e^{-(n - m) z}\, \nu(dz) + nd\, 1(m = 1) \tag{22} $$

$$ = \binom{n}{m} \int_0^1 x^m (1 - x)^{n - m}\, \tilde{\nu}(dx) + nd\, 1(m = 1) \tag{23} $$

for $\tilde{\nu}(dx)$ the image of $\nu(dz)$ via $x = 1 - e^{-z}$, as in (18)-(19). Note by (16) that the integrals are finite for
$1 \le m \le n$, and that these quantities are linearly related to the Laplace exponent $\Phi$ by the elementary
identities

$$ \Phi(n) = \sum_{m=1}^{n} \Phi(n : m), \qquad n = 1, 2, \ldots \tag{24} $$

$$ \Phi(n : m) = \binom{n}{m} \sum_{j=0}^{m} (-1)^{j+1} \binom{m}{j} \Phi(n - m + j), \qquad 1 \le m \le n \tag{25} $$

where $\Phi(0) = 0$.



Theorem 5.2 (i) The composition structure derived from a subordinator by standard exponential
sampling is regenerative, with decrement matrix

$$ q(n : m) = \frac{\Phi(n : m)}{\Phi(n)}. \tag{26} $$

(ii) Every regenerative composition structure can be so derived from some subordinator.

(iii) The Lévy data $(d, \nu)$ of the subordinator is determined uniquely up to a positive factor by the
regenerative composition structure.
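To make the passage from a Laplace exponent to a decrement matrix concrete, here is a minimal Python sketch of (25) and (26). It is illustrative only; the names `binomial_moment`, `decrement_entry`, `phi_uniform` and `phi_drift` are hypothetical. Two exponents are used as assumptions for the example: $\Phi(n) = n/(n + 1)$, which is (19) with $d = 0$ and $\tilde{\nu}$ the uniform distribution on [0, 1], and the pure drift case $\Phi(n) = nd$ with the arbitrary value $d = 3$.

```python
from fractions import Fraction
from math import comb


def binomial_moment(phi, n, m):
    """Phi(n : m) via (25): C(n, m) * sum_j (-1)^(j+1) C(m, j) Phi(n - m + j)."""
    alternating = sum(
        (-1) ** (j + 1) * comb(m, j) * phi(n - m + j) for j in range(m + 1)
    )
    return comb(n, m) * alternating


def decrement_entry(phi, n, m):
    """q(n : m) = Phi(n : m) / Phi(n), as in (26)."""
    return binomial_moment(phi, n, m) / phi(n)


def phi_uniform(k):
    """Phi(k) = k / (k + 1): no drift, nu-tilde uniform on [0, 1]."""
    return Fraction(k, k + 1)


def phi_drift(k, d=Fraction(3)):
    """Phi(k) = k d: pure drift, no Levy measure."""
    return d * k


for n in range(1, 5):
    print(n, [decrement_entry(phi_uniform, n, m) for m in range(1, n + 1)])
```

In the uniform case each row printed is the discrete uniform distribution $1/n$, matching the stick-breaking example with uniform $X$; in the pure drift case all mass sits on $m = 1$, the pure singleton composition.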



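To prepare for the proof, we start by recalling some known facts about the passage of a subordinator
across an independent exponential level.

Lemma 5.3 [29] Let $\epsilon$ be an exponential random variable with rate $\rho$, independent of $\mathcal{R}$ which is the
closed range of a subordinator $(S_t)$ with Laplace exponent $\Phi$. Let $G_\epsilon := G(\mathcal{R}, \epsilon)$, $D_\epsilon := D(\mathcal{R}, \epsilon)$, and
$\Delta_\epsilon := D_\epsilon - G_\epsilon$, so that almost surely $\Delta_\epsilon$ is the length of the interval component of $[0, \infty] \setminus \mathcal{R}$ which covers
$\epsilon$, with $\Delta_\epsilon = 0$ if $\epsilon \in \mathcal{R}$. The random variables $G_\epsilon$ and $\Delta_\epsilon$ are independent, with Laplace transforms

$$ \mathbb{E} \exp(-s G_\epsilon) = \frac{\Phi(\rho)}{\Phi(s + \rho)}, \qquad \mathbb{E} \exp(-s \Delta_\epsilon) = \frac{\Phi(s + \rho) - \Phi(s)}{\Phi(\rho)}. \tag{27} $$

Note that the second formula in (27) is equivalent to

$$ \mathbb{P}(\Delta_\epsilon \in dz) = \frac{(1 - e^{-\rho z})\, \nu(dz) + \rho d\, \delta_0(dz)}{\Phi(\rho)} \tag{28} $$

where $\delta_0$ is a unit mass at 0.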



Proof of Theorem 5.2 (i). The regenerative property of the composition structure derived from a
subordinator follows easily from the memoryless property of the exponential distribution and the regenerative
property of $\mathcal{R}$ at time $D_{1n} := D(\mathcal{R}, \epsilon_{1n})$. To derive (26), observe that $\epsilon_{1n}$ is exponential with rate $n$ and,
by the construction,

$$ q(n : m) = \mathbb{P}(D_{1n} \in [\epsilon_{mn}, \epsilon_{m+1,n}]) $$

(with the convention $\epsilon_{n+1,n} = \infty$). Let $G_{1n} := G(\mathcal{R}, \epsilon_{1n})$ and $\Delta_{1n} := D_{1n} - G_{1n}$. By Lemma 5.3, $\Delta_{1n}$
has distribution (28) for $\rho = n$. Moreover, given $\Delta_{1n} = z$ with $z > 0$, the random variable $\epsilon_{1n} - G_{1n}$ is
distributed like an exponential variable $\epsilon(n)$ with rate $n$ conditioned on $\epsilon(n) < z$. So the probability that
$\epsilon_{1n}$ hits the closed range $\mathcal{R}$ of the subordinator (causing a singleton) is

$$ \mathbb{P}(\epsilon_{1n} \in \mathcal{R}) = \mathbb{P}(\Delta_{1n} = 0) = \frac{nd}{\Phi(n)} \tag{29} $$

and given the complementary event that $\epsilon_{1n}$ misses $\mathcal{R}$, with $\epsilon_{1n} - G_{1n} = x > 0$ and $\Delta_{1n} = z > x$, the
conditional probability that $D_{1n} \in [\epsilon_{mn}, \epsilon_{m+1,n}]$ equals

$$ \binom{n - 1}{m - 1} \left(1 - e^{-(z - x)}\right)^{m - 1} e^{-(n - m)(z - x)}. $$

So the probability that $\epsilon_{1n}$ finds a gap in $\mathcal{R}$, and exactly $m$ of the $n$ exponential variables $\epsilon_1, \ldots, \epsilon_n$ fall
in that gap, is

$$ \frac{1}{\Phi(n)} \int_0^\infty \nu(dz) \int_0^z n e^{-nx} \binom{n - 1}{m - 1} \left(1 - e^{-(z - x)}\right)^{m - 1} e^{-(n - m)(z - x)}\, dx = \frac{1}{\Phi(n)} \binom{n}{m} \int_0^\infty (1 - e^{-z})^m e^{-(n - m) z}\, \nu(dz) $$

by application of the formula $\int_0^z m e^{-mx} (1 - e^{x - z})^{m - 1}\, dx = (1 - e^{-z})^m$, which has an immediate
interpretation in terms of the order statistics of $m$ independent exponential variables. Now (26) follows because
$q(n : m)$ is given by the above formula for $m > 1$ and has the additional term $nd/\Phi(n)$ from (29) for
$m = 1$. □

To prepare for the proof of the rest of Theorem 5.2 we record a sequence of four preliminary results.
The first is elementary.

Lemma 5.4 For $1 \le m \le n$ let $\Phi(n : m)$ and $\Phi(n)$ be real variables related by (25) with $\Phi(0) = 0$.
Then the identity (24) holds. Moreover, (25) for $1 \le m \le n \le n'$ implies the recursion

$$ \Phi(n : m) = \frac{m + 1}{n + 1}\, \Phi(n + 1 : m + 1) + \frac{n - m + 1}{n + 1}\, \Phi(n + 1 : m), \qquad 1 \le m \le n < n'. \tag{30} $$

Conversely, (30) and (24) for $1 \le n \le n'$ imply (25).

A sequence $\Phi$ such that $\Phi(n : m)$ defined by (25) is non-negative for all $n$ and $m$ is known as a completely
alternating sequence [4], and there is the following integral representation of such sequences:

Proposition 5.5 [4, Proposition 6.12 for k = 1, p. 134] A sequence $(\Phi(n), n \ge 0)$ with $\Phi(0) = 0$
and $\Phi(n) > 0$ for $n > 0$ is such that all entries $\Phi(n : m)$ defined by (25) are non-negative if and only if
there is the integral representation (19) for some measure $\tilde{\nu}$ on $]0, 1]$ and $d \ge 0$. Moreover $\tilde{\nu}$ and $d$ are
uniquely determined by $\Phi$.

Lemma 5.6 Suppose that a sequence of numbers $(\Phi(n), n \ge 0)$ with $\Phi(0) = 0$ satisfies $\Phi(n) > 0$ for
some $n \le n'$, and is such that each entry $\Phi(n : m)$, $1 \le m \le n \le n'$, of the matrix (25) is non-negative.
Then $\Phi(n) > 0$ for all $1 \le n \le n'$, and the entries of the matrix (26) with $1 \le m \le n \le n'$ are
non-negative and satisfy (10) for this range of indices. Moreover, if the entries $\Phi(n : m)$ of the matrix
(25) are non-negative for arbitrary $n$ then (26) is the decrement matrix of some regenerative composition
structure.

Proof. We apply Lemma 5.4. Dividing (30) by $\Phi(n + 1)$ and substituting it in the to-be-checked (10), we
transform it by elementary algebra to

$$ \Phi(n + 1 : 1) = (n + 1)\left(\Phi(n + 1) - \Phi(n)\right) $$

which is true as a special case of (25). □



Lemma 5.7 The decrement matrix of a regenerative composition structure can be represented in the
form (26), by a matrix $(\Phi(n : m), 1 \le m \le n < \infty)$ with non-negative entries satisfying (30) and (24).
The matrix $\Phi$ is determined by $q$ uniquely up to a positive factor.

Proof. The statement is only nontrivial when $0 < p(n) < 1$ for $n \ge 2$. So let us consider a decrement
matrix with entries $0 < q(n : m) < 1$ for $n > 1$. Fix $n'$ and set by definition $\Phi(n' : m) := q(n' : m)$ for
$m = 1, \ldots, n'$. Consider the unique solution $(\Phi(n : m), 1 \le m \le n \le n')$ to (30) with the values $q(n' : m)$
at level $n'$. Because $q(n' : m) > 0$, it is easily seen that $\Phi(n : m) > 0$ for $1 \le m \le n \le n'$ and therefore
$\Phi(n) := \Phi(n : 1) + \cdots + \Phi(n : n) > 0$ for $n \le n'$ (and $\Phi(n') = 1$). By the first assertion of Lemma 5.6
and the remark before, the elements $\Phi(n : m)/\Phi(n)$ satisfy the recursion (10) for $n < n'$ and for $n = n'$
they coincide with $q(n' : m)$. Thus by the uniqueness of solutions to (10) for $n \le n'$ with given values at
level $n'$ we conclude that $q(n : m)$ coincides with $\Phi(n : m)/\Phi(n)$ for all $1 \le m \le n \le n'$.

Keeping $n'$ fixed, suppose there is another representation $q(n : m) = \tilde{\Phi}(n : m)/\tilde{\Phi}(n)$, $n \le n'$. Then
$\tilde{\Phi}(n' : m) = \tilde{\Phi}(n')\, q(n' : m)$, thus arguing as above and using linearity we get $\tilde{\Phi}(n : m) = \tilde{\Phi}(n')\, \Phi(n : m)$
for $1 \le m \le n \le n'$. Thus the representation for given $n'$ is unique up to a multiple, and it becomes
unique subject to a normalisation constraint.

Assuming the normalisation $\Phi(1 : 1) = 1$, the finite matrices $(\Phi(n : m), 1 \le m \le n \le n')$ constructed
for each $n'$ are consistent as $n'$ varies, by the uniqueness for each particular $n'$, thus they constitute an
infinite matrix and the desired representation follows. □



Proof of Theorem 5.2 (ii) and (iii). These results follow immediately from Lemma 5.7 and Proposition [?]. □

For an alternate proof of (ii), see [7]. Also, (iii) can be deduced from Theorem 5.1 and a general fact
about composition structures [15, Corollary 12].

Class frequencies. If the regenerative composition structure $(C_n)$ is derived from a subordinator by
standard exponential sampling, the associated composition $C^*$ of the infinite set $\mathbb{N}$ is constructed
by assigning $i$ and $j$ to different classes iff the closed interval with endpoints $\epsilon_i$ and $\epsilon_j$ intersects $\mathcal{R}$.
The ordering of classes is maintained according to the order of the $\epsilon_j$ associated with the classes. The
random set of positive integers $j$ whose $\epsilon_j$ falls in a particular interval component of $\mathcal{R}^c := [0,\infty] \setminus \mathcal{R}$
forms a positive class, while each $j$ whose $\epsilon_j$ hits $\mathcal{R}$ forms a singleton class. By the law of large numbers,
the probability assigned to an interval component of $\mathcal{R}^c$ by the standard exponential distribution is the
frequency of the corresponding class of $C^*$, that is, the almost sure limit as $n \to \infty$ of the proportion of
elements of $[n]$ which belong to the class. For instance, if $]a,b[\ \subset \mathcal{R}^c$ is the interval component which
covers $\epsilon_1$, then for large $n$ the class of $C^*$ containing element 1 will have approximately $n(e^{-a} - e^{-b})$
elements, so there will be some part of $C_n$ of this size. We note the following corollary of Theorem 5.2.

Corollary 5.8 Let $f$ denote the random frequency of the union of all singleton classes in the exchangeable
random partition of $\mathbb{N}$ associated with a regenerative composition structure with decrement matrix
of the form (26). Then

\[ f = d \int_0^\infty \exp(-S_t)\,dt \tag{31} \]

where $(S_t)$ is the associated subordinator with Laplace exponent $\Phi$ and $d$ is the drift coefficient of $(S_t)$,
and the distribution of $f$ on $[0,1]$ is determined by the moments

\[ E(f^n) = \frac{n!\,d^n}{\prod_{j=1}^n \Phi(j)}. \tag{32} \]

Proof. The derivation from $(S_t)$ by standard exponential sampling gives

\[ f = \int_0^\infty e^{-z}\, 1(S_t = z \text{ for some } t \ge 0)\,dz \]

and (31) follows by the change of variable $z = S_t$. This change of variable is justified by noting that the
function $t \mapsto S_t$ is almost everywhere differentiable with derivative $d$. Formula (32) can now be read from
the work of Carmona, Petit and Yor [8, Proposition 3.3], or derived from (15). □

Extensive discussion of the exponential functional $\int_0^\infty \exp(-S_t)\,dt$ is found in [?]. See [?] for
further applications to regenerative composition structures.

6 Multiplicatively regenerative sets 

By mapping $[0,\infty]$ onto $[0,1]$ via $z \mapsto 1 - e^{-z}$ we transform a subordinator $(S_t)$ into a multiplicative
subordinator $\tilde{S}_t := 1 - \exp(-S_t)$: for $t' > t$ the ratio $(1 - \tilde{S}_{t'})/(1 - \tilde{S}_t)$ has the same distribution as $1 - \tilde{S}_{t'-t}$
and is independent of $(\tilde{S}_u,\ 0 \le u \le t)$. This construction appears also in [?]. The counterpart of
(?) is

\[ \tilde{S}_t = 1 - e^{-dt} \prod_{\tau_j \le t} (1 - \tilde{\Delta}_j) \]

where $\tilde{\Delta}_j = 1 - \exp(-\Delta_j)$ and the product is over the atoms $(\tau_j, \tilde{\Delta}_j)$ of a Poisson point process in
the strip $[0,\infty[\ \times\ [0,1]$, with intensity measure Lebesgue $\times\ \tilde{\nu}$, where $\tilde{\nu}$ is the image of the measure $\nu$ via
$z \mapsto 1 - e^{-z}$. Note that the mapping preserves order, so that $(\tilde{S}_t)$ increases from 0 to 1.

Let $\tilde{\mathcal{R}} := 1 - \exp(-\mathcal{R})$ be the closed range of the multiplicative subordinator (speaking of closed
subsets of $[0,1]$ we shall always mean that the points 0 and 1 are contained in the set). The transformation
$z \mapsto 1 - e^{-z}$ takes an exponential sample $(\epsilon_j)$ into a uniform sample $(u_j)$. The regenerative composition
structure $(C_n)$ derived from the subordinator $(S_t)$ by exponential sampling can now be described as
follows: $C_n$ is induced by separating the first $n$ uniform variables $u_j$ by the points of $\tilde{\mathcal{R}}$. Note that the
frequencies of positive classes derived from $(C_n)$ now coincide with the lengths of open interval components
of $\tilde{\mathcal{R}}^c = [0,1] \setminus \tilde{\mathcal{R}}$, and the remaining frequency of singletons $f$, as in Corollary 5.8, is the Lebesgue measure
of $\tilde{\mathcal{R}}$.

For a closed subset $R$ of $[0,1]$ and $z \in [0,1[$ such that $R\ \cap\ ]z,1[\ \neq \emptyset$, we can define another closed set

\[ R(z) := \left\{ \frac{y - D(R,z)}{1 - D(R,z)} :\ y \in R \cap [D(R,z), 1] \right\} \tag{33} \]

which is the part of $R$ strictly to the right of $D(R,z)$, scaled back to $[0,1]$.

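Definition 6.1 A random closed set $\tilde{\mathcal{R}} \subset [0,1]$ is called multiplicatively regenerative if, for each
$z \in [0,1[$, conditionally on $\{D(\tilde{\mathcal{R}},z) < 1\}$ the random set $\tilde{\mathcal{R}}(z)$, defined as in (33), is independent of
$[0, D(\tilde{\mathcal{R}},z)] \cap \tilde{\mathcal{R}}$, and has the same distribution as $\tilde{\mathcal{R}}$.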

The following proposition is easily checked: 

Proposition 6.2 For random closed sets $\tilde{\mathcal{R}} \subset [0,1]$ and $\mathcal{R} \subset [0,\infty]$ related by $\tilde{\mathcal{R}} = 1 - \exp(-\mathcal{R})$, the
random set $\mathcal{R}$ is regenerative iff $\tilde{\mathcal{R}}$ is multiplicatively regenerative.

As a variation of Corollary 6.5, a condition for multiplicative regeneration of a random closed subset
$\tilde{\mathcal{R}}$ of $[0,1]$ can also be given in terms of a single independent uniform variable.

We associate each composition $(n_1, \ldots, n_k)$ of $n$ with the finite closed set whose points are the partial
sums of the parts $n_1, \ldots, n_k$ divided by $n$; e.g. the composition $(4, 2, 3, 1)$ of 10 is associated with the
set $\{0, 0.4, 0.6, 0.9, 1\}$. Thus a composition structure $(C_n)$ is associated with a sequence of random sets
$(\mathcal{R}_n)$.
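
The correspondence between compositions and finite closed sets can be sketched in a few lines of Python. This is only an illustration: the function names `composition_to_set` and `composition_from_sample` are hypothetical, and the second function assumes a finite closed set and a sample that avoids its points (which happens almost surely for uniform variables).

```python
from bisect import bisect_right
from fractions import Fraction


def composition_to_set(parts):
    """Partial sums of the parts divided by n, with the point 0 prepended."""
    n = sum(parts)
    points, running = [Fraction(0)], 0
    for part in parts:
        running += part
        points.append(Fraction(running, n))
    return points


def composition_from_sample(sample, closed_set):
    """Separate sample points by a finite closed set containing 0 and 1.

    Returns the non-zero counts of sample points in the gaps of the set,
    read from left to right. Assumes no sample point lies in the set.
    """
    counts = [0] * (len(closed_set) - 1)
    for u in sample:
        counts[bisect_right(closed_set, u) - 1] += 1
    return tuple(c for c in counts if c > 0)


points = composition_to_set((4, 2, 3, 1))
print([float(x) for x in points])
print(composition_from_sample([0.1, 0.2, 0.5, 0.95], points))
```

Empty gaps are dropped in the second function, mirroring the deletion of empty boxes in the definition of sampling consistency.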

Lemma 6.3 [15] Let $(C_n)$ be a composition structure and let $(\mathcal{R}_n)$ be the associated sequence of random
sets. Then $\mathcal{R}_n$ converges almost surely (in the Hausdorff metric) to some random closed subset $\tilde{\mathcal{R}}$, and
$(C_n)$ is distributed as if by using $\tilde{\mathcal{R}}$ to separate the points in a random sample of uniform $[0,1]$ variables
independent of $\tilde{\mathcal{R}}$.

From Theorem 5.2, Proposition 6.2 and Lemma 6.3 we deduce

Corollary 6.4 The composition structure $(C_n)$ is regenerative iff $\tilde{\mathcal{R}}$ is multiplicatively regenerative.

As indicated in [7], it is also possible to prove Corollary 6.4 directly, and then retrace the above argument
to obtain an alternate proof of Theorem 5.2.

A sufficient condition for regeneration. We note that in the usual definition of a regenerative
random subset $\mathcal{R}$ of $[0,\infty]$, as in Section 5, the independence of the two random sets
$\mathcal{R}_t := (\mathcal{R} - D(\mathcal{R},t)) \cap [0,\infty]$ and $[0, D(\mathcal{R},t)] \cap \mathcal{R}$ for all $t$ can be replaced by the apparently weaker condition of
independence of the random set $\mathcal{R}_t$ and the random variable $D(\mathcal{R},t)$ for all $t$. This is due to the following
result:

Corollary 6.5 Let $\mathcal{R}$ be a random closed subset of $[0,\infty]$, let $\epsilon$ be an exponential random variable
with rate 1 independent of $\mathcal{R}$, and let $\mathcal{R}_\epsilon := (\mathcal{R} - D(\mathcal{R},\epsilon)) \cap [0,\infty]$. If $\mathcal{R}_\epsilon \stackrel{d}{=} \mathcal{R}$ and $\mathcal{R}_\epsilon$ is independent of
$D(\mathcal{R},\epsilon)$, then $\mathcal{R}$ is regenerative.

Proof. Let $(C_n)$ be the composition structure derived from $\mathcal{R}$ by the standard exponential sampling
with variables $(\epsilon_j)$. Then split $C_n = (C_n', C_n'')$, where $C_n'$ is the sequence of non-zero numbers of $\epsilon_j$ for
$1 \le j \le n$ falling in complementary intervals of $\mathcal{R}$ up to and including the count in the interval containing
$\epsilon_1$. This splitting of $C_n$ is the example preceding Proposition 3.2, hence by the assumption on $\mathcal{R}$ and the
memoryless property of the exponential distribution, it satisfies the assumption of Proposition 3.2. The
conclusion now follows by application of Proposition 3.2, Theorem 5.2, Corollary 6.4 and Proposition 6.2.
□

7 Parametrisation of decrement matrices 

The representation $q(n:m) = \Phi(n:m)/\Phi(n)$ provides one parametrisation of the regenerative composition
structures in terms of a sequence $(\Phi(n),\ n \ge 1)$. To be probabilistically meaningful, this must be
the sequence of evaluations of some Laplace exponent at positive integer values. But we may also regard
the expressions for $q(n:m)$ as a collection of rational functions in variables $\Phi(n),\ n \ge 1$. This section
presents some alternative parametrisations of regenerative composition structures, and discusses their
probabilistic and algebraic relations to each other.

7.1 Structural moments 

One meaningful collection of parameters is the sequence of diagonal entries

\[ p(n) = q(n:n) \]

which starts with $p(1) = 1$. We call these diagonal entries of the decrement matrix the structural moments
of the composition structure, as they coincide with moments of the structural distribution $\Sigma$:

\[ p(n) = \int_0^1 x^{n-1}\, \Sigma(dx) \]

where $\Sigma$ is the distribution of the length of the interval component of $\tilde{\mathcal{R}}^c$ containing a given uniform
sample point, say $u_1$. This random length is the frequency of the class of $C^*$ containing element 1, that
is, a size-biased pick from the collection of frequencies [31]. Note from (29) and (19), or from Corollary
5.8, that the expectation of the total frequency of singletons $f = \mathrm{Lebesgue}(\tilde{\mathcal{R}})$ is the measure assigned
by $\Sigma$ to 0:

\[ E(f) = \Sigma(\{0\}) = d/\Phi(1) = d \Big/ \left( d + \int_0^1 x\, \tilde{\nu}(dx) \right). \]

From $p(n) = \Phi(n:n)/\Phi(n)$, by expanding the numerator by (25) we obtain the relation

\[ \Phi(n)\bigl(p(n) + (-1)^n\bigr) = \sum_{j=1}^{n-1} (-1)^{j+1} \binom{n}{j} \Phi(j), \tag{34} \]

which may be seen as a recursion for $\Phi(n)$, $n = 1, 2, \ldots$. Assuming the initial value $\Phi(1) = 1$, the recursion
has a unique solution, which is necessarily positive by Lemma 5.7. Thus the recursion (34) allows $q$ to be
recovered from $p(n)$, $n = 1, 2, \ldots$, by first recursively computing $\Phi(n)$, $n = 1, 2, \ldots$, then $\Phi(n:m)$ from
(25), and finally using (26). Thus we have proved

Proposition 7.1 A regenerative composition structure is uniquely determined by the structural moments
$p(n) = q(n:n)$ for $n = 1, 2, \ldots$. Each $q(n:m)$ for $1 \le m \le n$ is expressible as a rational function
in the variables $p(1) = 1, p(2), \ldots, p(n)$.



To illustrate the result, the first few entries are

\[ q(2:1) = 1 - p(2) \]

\[ q(3:1) = \frac{1 - 3p(2) + 2p(3)}{1 - p(2)} \]

\[ q(3:2) = \frac{2p(2) - 3p(3) + p(2)p(3)}{1 - p(2)} \]

\[ q(4:1) = \frac{1 - 5p(2) + 8p(3) - 4p(2)p(3) - 3p(4) + 3p(2)p(4)}{1 - 2p(2) + 2p(3) - p(2)p(3)} \]

\[ q(4:2) = \frac{3p(2) - 9p(3) + 6p(2)p(3) + 6p(4) - 9p(2)p(4) + 3p(3)p(4)}{1 - 2p(2) + 2p(3) - p(2)p(3)} \]

\[ q(4:3) = \frac{3p(3) - 3p(2)p(3) - 4p(4) + 8p(2)p(4) - 5p(3)p(4) + p(2)p(3)p(4)}{1 - 2p(2) + 2p(3) - p(2)p(3)} \]


The complexity of such formulas increases rapidly with n. 
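
The three-step procedure behind Proposition 7.1 (solve (34) for $\Phi$, expand $\Phi(n:m)$ by (25), divide as in (26)) can be carried out mechanically. The following Python sketch does this in exact rational arithmetic. The function names are hypothetical, the expansion used for (25) is $\Phi(n:m) = \binom{n}{m}\sum_{j=0}^{m}(-1)^{j+1}\binom{m}{j}\Phi(n-m+j)$ with $\Phi(0)=0$, and the input $p(n) = 1/n$ is just an illustrative choice of structural moments.

```python
from fractions import Fraction
from math import comb


def phi_from_structural_moments(p, n_max):
    """Solve recursion (34) for Phi(1), ..., Phi(n_max) with Phi(1) = 1.

    `p` maps n to the structural moment p(n). The division fails if
    p(n) + (-1)^n vanishes, i.e. in the degenerate case p(n) = 1 for odd n.
    """
    phi = {0: Fraction(0), 1: Fraction(1)}
    for n in range(2, n_max + 1):
        rhs = sum((-1) ** (j + 1) * comb(n, j) * phi[j] for j in range(1, n))
        phi[n] = rhs / (p[n] + (-1) ** n)
    return phi


def phi_entry(phi, n, m):
    """Phi(n : m) expanded through (25), using Phi(0) = 0."""
    return comb(n, m) * sum(
        (-1) ** (j + 1) * comb(m, j) * phi[n - m + j] for j in range(m + 1)
    )


def decrement_matrix(p, n_max):
    """All entries q(n : m) = Phi(n : m) / Phi(n) for n <= n_max, as in (26)."""
    phi = phi_from_structural_moments(p, n_max)
    return {
        (n, m): phi_entry(phi, n, m) / phi[n]
        for n in range(1, n_max + 1)
        for m in range(1, n + 1)
    }


p = {n: Fraction(1, n) for n in range(1, 5)}  # illustrative moments p(n) = 1/n
q = decrement_matrix(p, 4)
# Compare with the rational expression for q(3:1) listed above.
print(q[(3, 1)], (1 - 3 * p[2] + 2 * p[3]) / (1 - p[2]))
```

Whether the resulting matrix is a genuine decrement matrix still depends on the positivity conditions discussed next; the sketch only performs the algebra.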

In general, structural moments do not determine a composition structure uniquely, because they do
not even determine the associated partition structure. See [31] for further discussion. Since uniqueness
does hold in the special case of regenerative composition structures, it is natural to seek a characterisation
of structural moments in this case. There is the following immediate consequence of Proposition 7.1 and
Lemma 5.6.

Corollary 7.2 A sequence $p(n)$, $n = 1, 2, \ldots$ with $p(1) = 1$ and $0 < p(n) < 1$ for $n > 1$ is a sequence
of structural moments of some regenerative composition structure if and only if the following conditions
are fulfilled:

(i) the sequence $\Phi(n)$, $n = 1, 2, \ldots$ defined by the recursion (34) with $\Phi(1) = 1$ is positive, and

(ii) each $\Phi(n:m)$, $1 \le m \le n < \infty$, defined by (25) is non-negative.

If this is the case,

\[ p(n) = \frac{\int_0^1 x^n\, \tilde{\nu}(dx) + d\,1(n = 1)}{\int_0^1 \bigl(1 - (1-x)^n\bigr)\, \tilde{\nu}(dx) + nd}, \qquad n \ge 1, \]

for some $d \ge 0$ and some measure $\tilde{\nu}$ on $]0,1]$ with finite first moment.



Remark. We know that $p(n)$, $n = 1, 2, \ldots$ is a moment sequence from the general facts about
partition structures, or from the interpretation of $p(n)$ as the probability that $n$ balls fall in the same
box. From an analytical perspective, it does not seem obvious that the nonlinear transform given by
$p(n) = \Phi(n:n)/\Phi(n)$, $n = 1, 2, \ldots$ indeed yields a completely monotonic sequence for an arbitrary Laplace
exponent.

Because the structural moments are determined by the (unordered) partition structure, Proposition
7.1 and Kingman's representation of partition structures [23] imply:






Corollary 7.3 Each distribution of an infinite exchangeable partition of $\mathbb{N}$ (which can be identified
with a partition structure) corresponds to at most one regenerative composition structure. Equivalently,
for each distribution of a decreasing sequence $(Y_j)$ with $Y_j \ge 0$ and $\sum_j Y_j \le 1$, there exists at most
one distribution for a multiplicatively regenerative set $\tilde{\mathcal{R}} \subset [0,1]$ such that the ranked lengths of interval
components of $\tilde{\mathcal{R}}^c$ are distributed like $(Y_j)$.



A constructive method to verify whether a given exchangeable partition of $\mathbb{N}$ is induced by a regenerative
composition structure amounts to computing $q$ from the structural moments, and then checking that the
given EPPF coincides with the EPPF computed by formulas (?) and (?).

The general problem of characterising structural distributions of partition structures was posed by Pitman
and Yor [?]. The characterisation of structural distributions of regenerative composition structures
provided by Corollary 7.2 leaves open the following question: given the collection of structural moments
of a regenerative composition, or given its Laplace exponent $\Phi$, describe in some way how the classes
of the associated unordered partition should be arranged to produce the composition. We answer some
restricted forms of this question in the next section, but do not see how to answer it in any generality.



7.2 Singleton probabilities 

Instead of the event '$n$ balls fall in the same box', consider the event '$n$ balls fall in $n$ different boxes'. Let
$e(n)$ be the probability of this event, that is

\[ e(n) := p(1, \ldots, 1) = q(n:1)\, q(n-1:1) \cdots q(2:1). \]

By the definition and from the representation (26) we derive

\[ \frac{e(n)}{e(n-1)} = q(n:1) = n \left( 1 - \frac{\Phi(n-1)}{\Phi(n)} \right), \]

which can be read as

\[ \frac{\Phi(n-1)}{\Phi(n)} = 1 - \frac{q(n:1)}{n}. \tag{35} \]

This shows that any one of the sequences $(e(n),\ n > 0)$, $(q(n:1),\ n > 0)$ or $(\Phi(n)/\Phi(1),\ n > 0)$ uniquely
determines each of the other two sequences.

As is seen from (25) and (35), in the variables $q(n:1)$, $n = 1, 2, \ldots$ the elements of the decrement matrix
become polynomials

\[ q(n:m) = \binom{n}{m} \sum_{j=0}^{m} (-1)^{j+1} \binom{m}{j} \prod_{i=n-m+j+1}^{n} \left( 1 - \frac{q(i:1)}{i} \right), \tag{36} \]

to be compared with the rational functions of structural moments considered in Subsection 7.1. For
example

\[ q(4:2) = 2\, q(3:1) - \tfrac{3}{2}\, q(4:1) - \tfrac{1}{2}\, q(3:1)\, q(4:1). \]
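
As a check on the polynomial form (36), the following Python sketch evaluates $q(n:m)$ from the first column $q(i:1)$, $i \le n$, by summing over $j$ the signed products of the factors $1 - q(i:1)/i$ for $n-m+j < i \le n$. The function name and the numerical values of the first column are hypothetical and serve only to exercise the identity.

```python
from fractions import Fraction
from math import comb


def q_from_first_column(q1, n, m):
    """Evaluate (36): q(n : m) as a polynomial in q(i : 1), i <= n.

    `q1` maps i to q(i : 1), with q1[1] = 1. The empty product (j = m)
    equals 1, and any product that includes i = 1 vanishes.
    """
    total = Fraction(0)
    for j in range(m + 1):
        product = Fraction(1)
        for i in range(n - m + j + 1, n + 1):
            product *= 1 - Fraction(q1[i]) / i
        total += (-1) ** (j + 1) * comb(m, j) * product
    return comb(n, m) * total


# Arbitrary illustrative first column; only q1[1] = 1 is forced.
q1 = {1: 1, 2: Fraction(1, 2), 3: Fraction(2, 5), 4: Fraction(1, 3)}
lhs = q_from_first_column(q1, 4, 2)
rhs = 2 * q1[3] - Fraction(3, 2) * q1[4] - Fraction(1, 2) * q1[3] * q1[4]
print(lhs, rhs)
```

The rows produced this way always sum to 1 as a matter of algebra; non-negativity of the entries is a separate requirement.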

The definition of $e(n)$ makes sense for a general partition structure. Thus to check whether a given partition
structure is induced by a regenerative composition structure, we can use the above formulas to translate
$e(n)$, $n > 0$, into $q$ and then compare the EPPF resulting from (?), (?) with the given EPPF. In particular,
if a regenerative rearrangement is possible, the sequences $(p(n),\ n > 0)$ and $(e(n),\ n > 0)$ must be
computable from each other, as appears by eliminating the variables $\Phi$ from $p(n) = \Phi(n:n)/\Phi(n)$ and (35).

8 The two-parameter family



8.1 General setup 

Consider the $(\alpha,\theta)$-partition structure determined by the following formula of [28, 31] for the distribution of
$\Pi_n$, an exchangeable partition of $[n]$: for each particular partition $\pi$ of $[n]$ into $k$ classes of sizes $n_1, \ldots, n_k$,

\[ P(\Pi_n = \pi) = \frac{\prod_{i=1}^{k-1} (\theta + i\alpha)}{[1+\theta]_{n-1}} \prod_{j=1}^{k} [1-\alpha]_{n_j - 1}, \tag{37} \]

where the notation (13) is used for rising factorials. This formula defines a partition structure for
$0 \le \alpha < 1$ and $\theta \ge 0$, and also for some $(\alpha,\theta)$ with either $\alpha < 0$ or $\theta < 0$. We wish to establish whether this
partition structure can be associated with some regenerative composition structure.
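Following the method in Section 7 we first compute $e(n)$ as a special case of (37):

\[ e(n) = p(1, 1, \ldots, 1) = \prod_{i=1}^{n-1} \frac{\theta + i\alpha}{\theta + i}, \]

which leads by application of (35) to

\[ \frac{\Phi(n)}{\Phi(1)} = \frac{n\,[\theta+1]_{n-1}}{[2+\theta-\alpha]_{n-1}}. \]
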
This yields, by virtue of (30) or (25), the formula

\[ \frac{\Phi(n:m)}{\Phi(1)} = \binom{n}{m} \frac{[1-\alpha]_{m-1}\,[\theta+1]_{n-1}}{[2+\theta-\alpha]_{n-1}\,[\theta+n-m]_m}\, \bigl((n-m)\alpha + m\theta\bigr). \tag{38} \]

Therefore

\[ q(n:m) = \frac{\Phi(n:m)}{\Phi(n)} = \binom{n}{m} \frac{[1-\alpha]_{m-1}}{[\theta+n-m]_m}\, \frac{(n-m)\alpha + m\theta}{n}. \tag{39} \]

Since $q$ in (39) is non-negative exactly when $0 \le \alpha < 1$ and $\theta \ge 0$, we conclude that $q$ is the decrement
matrix of a regenerative composition structure for precisely this range of parameters.
Observe that the resulting formula

\[ p(n) = q(n:n) = \frac{[1-\alpha]_{n-1}}{[1+\theta]_{n-1}} \tag{40} \]

yields the moments of beta$(1-\alpha, \alpha+\theta)$, which is the structural distribution for all members of the
two-parameter family of partition structures.
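
Formula (39) is easy to test numerically. The Python sketch below evaluates $q(n:m) = \binom{n}{m}\,[1-\alpha]_{m-1}\,((n-m)\alpha + m\theta)\,/\,(n\,[\theta+n-m]_m)$ in exact arithmetic; the function names are hypothetical, and the sketch assumes $\theta > 0$ so that no rising factorial in a denominator vanishes. The parameter values are an arbitrary illustration.

```python
from fractions import Fraction
from math import comb


def rising(x, n):
    """Rising factorial [x]_n = x (x + 1) ... (x + n - 1), with [x]_0 = 1."""
    result = Fraction(1)
    for i in range(n):
        result *= x + i
    return result


def q_two_parameter(n, m, alpha, theta):
    """Entry q(n : m) of formula (39); this sketch assumes theta > 0."""
    return (
        comb(n, m)
        * rising(1 - alpha, m - 1)
        / rising(theta + n - m, m)
        * ((n - m) * alpha + m * theta)
        / n
    )


alpha, theta = Fraction(1, 2), Fraction(1)  # illustrative parameter choice
for n in range(1, 5):
    row = [q_two_parameter(n, m, alpha, theta) for m in range(1, n + 1)]
    print(n, row, sum(row))
```

Each printed row should sum to 1, and its last entry is the structural moment (40).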

Adopting the normalisation $\Phi(1) = B(1-\alpha, 1+\theta)$, where

\[ B(a,b) := \Gamma(a)\Gamma(b)/\Gamma(a+b), \]

the Laplace exponent extending (38) becomes

\[ \Phi(s) = s\,B(1-\alpha,\ s+\theta). \tag{41} \]

The corresponding measure is determined by the formula

\[ \tilde{\nu}[x,1] = x^{-\alpha}(1-x)^{\theta}, \qquad 0 < x \le 1. \tag{42} \]

It remains to check that the partition structure induced by this regenerative composition structure is
given by (37). This is done in the following theorem:



Theorem 8.1 For $0 \le \alpha < 1$ and $\theta \ge 0$ the distribution of the exchangeable random partition $\Pi_n$
of $[n]$ derived from the regenerative composition structure with Laplace exponent (41) is that of an $(\alpha,\theta)$
partition defined by formula (37). For other values of $(\alpha,\theta)$, besides the limiting case $(1,\theta)$ for $\theta \ge 0$ which
generates the pure singleton partition, there is no regenerative composition structure which generates an
$(\alpha,\theta)$-partition structure.

Proof. By the above discussion we can restrict consideration to the case $0 \le \alpha < 1$ and $\theta \ge 0$. By
application of formulas (?), (?) and (39), the EPPF derived from the regenerative composition structure
with Laplace exponent (41) is a sum of $k!$ terms of the form

\[ \prod_{i=1}^{k} \frac{[1-\alpha]_{n_i-1}}{[\theta+N_i-n_i]_{n_i}}\, \frac{(N_i-n_i)\alpha + n_i\theta}{N_i} \]

where the sequence $(n_1, \ldots, n_k)$ and its tail sums $N_i = \sum_{j \ge i} n_j$ must be replaced by permutations of the
sequence and correspondingly transformed tail sums. To match up with (37), it just has to be checked
that the corresponding sum of $k!$ terms derived from

\[ \prod_{i=1}^{k} \frac{(N_i-n_i)\alpha + n_i\theta}{N_i\,\bigl((k-i)\alpha + \theta\bigr)} \tag{43} \]

equals 1. But this is easily verified together with the probabilistic interpretation given in the following
corollary. □

Corollary 8.2 In the setting of the previous theorem, given that the blocks of $\Pi_n$ are of sizes $n_1, \ldots, n_k$
when put in some arbitrary order, and given that the first $i-1$ of these blocks are the first $i-1$ blocks of
the ordered partition $C_n^*$, the conditional probability that this coincidence continues for one more step is
the $i$th factor in (43).

Put another way, given block sizes $n_1, \ldots, n_k$ and that the first $i-1$ blocks have been picked to leave
blocks of sizes $n_j$ for $i \le j \le k$, the next block is the block of index $j$ with probability proportional to
$(N_i - n_j)\alpha + n_j\theta$.
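
The sequential rule of Corollary 8.2 translates directly into a small computation: at each step the next block is chosen among the remaining ones with probability proportional to (remaining total minus block size) times $\alpha$ plus block size times $\theta$. The Python sketch below computes the probability that the blocks appear in a given order. The function name is hypothetical, and the last remaining block is placed with probability 1 so that the case $\theta = 0$ causes no division by zero.

```python
from fractions import Fraction
from itertools import permutations


def ordering_probability(sizes, alpha, theta):
    """Probability that blocks with the given sizes appear in this order.

    Blocks are treated as distinguishable by their position in `sizes`.
    """
    remaining = list(sizes)
    prob = Fraction(1)
    while len(remaining) > 1:
        total = sum(remaining)  # N_i: total size of blocks still to be placed
        weights = [(total - nj) * alpha + nj * theta for nj in remaining]
        prob *= Fraction(weights[0]) / sum(weights)
        remaining.pop(0)
    return prob


third = Fraction(1, 3)
for order in permutations((3, 1, 2)):
    # alpha = theta gives equal weights; alpha = 0 gives size-biased order.
    print(order, ordering_probability(order, third, third),
          ordering_probability(order, 0, 1))
```

With $\alpha = \theta$ every order receives the same probability, while $\alpha = 0$ reproduces a size-biased ordering, in line with the special cases discussed below.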

Several particular instances of the above results are known, as indicated in the following discussion of 
special cases. 

8.2 Case $(0, \theta)$ for $\theta > 0$

In this case the measure $\tilde{\nu}$ in (42) is a probability measure, the beta$(1,\theta)$ distribution. So the above
theorem and its corollary reduce to the well known fact that the ordered Ewens formula associated with
beta$(1,\theta)$ stick-breaking puts its parts in a size-biased random order.

8.3 Case $(\alpha, 0)$ for $0 < \alpha < 1$

In this case

\[ \tilde{\nu}(dx) = \alpha x^{-\alpha-1}\,dx + \delta_1(dx) \]

is a measure with a beta density on $]0,1[$ and a unit atom at 1. The product formula (?) reduces to

\[ p(n_1, \ldots, n_k) = \frac{[1-\alpha]_{n_k-1}}{(n_k-1)!} \prod_{i=1}^{k-1} \frac{\alpha\,[1-\alpha]_{n_i-1}}{n_i!}, \]

which is identical to the formula in [29, Equation (28)]. By comparison of these two formulas, the random
composition in this case is identical in distribution to that generated by $\mathcal{R}_\alpha \cap [0,1]$, where $\mathcal{R}_\alpha$ is the
range of a stable subordinator of index $\alpha$. In particular, $\mathcal{R}_\alpha$ can be realised as the zero set of a Bessel
process of dimension $2 - 2\alpha$. For $\alpha = 1/2$ this is the zero set of a standard Brownian motion.

The decrement matrix $q$ in this case has the special property that there is a probability distribution
$f$ on the positive integers such that

\[ q(n:m) = f(m) \ \text{ if } m < n \quad \text{and} \quad q(n:n) = 1 - \sum_{m=1}^{n-1} f(m). \tag{44} \]

Specifically,

\[ f(m) = \frac{\alpha\,[1-\alpha]_{m-1}}{m!} \tag{45} \]

and hence $q(n:n) = [1-\alpha]_{n-1}/(n-1)!$. The work of Young [?] shows that the only non-degenerate
regenerative composition structures with a decrement matrix of the form (44), for some probability
distribution $f$ on the positive integers, are those with $f$ of the form (45), obtained by uniform sampling
from $\mathcal{R}_\alpha \cap [0,1]$ for some $0 < \alpha < 1$.

The multiplicative regeneration property of $\mathcal{R}_\alpha \cap [0,1]$ is an immediate consequence of the standard
regeneration and self-similarity properties of $\mathcal{R}_\alpha$ as a subset of $[0,\infty]$. It implies that $\mathcal{R}_\alpha \cap [0,1]$ has the
same distribution as the closure of $\{1 - \exp(-S_t),\ t \ge 0\}$ where $(S_t)$ is a subordinator with no drift and
Lévy measure

\[ \nu(dz) = \alpha (1 - e^{-z})^{-\alpha-1} e^{-z}\,dz + \delta_\infty(dz) \]

on $[0,\infty]$, which is the image of $\tilde{\nu}$ via $x \mapsto -\log(1-x)$, so has an atom of mass 1 at $\infty$.

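As a check, let $\tau := \inf\{t : S_t = \infty\}$, which is the exponential time with rate 1 when the subordinator
jumps to $\infty$. Then, by application of the transformation and the Lévy-Khintchine formula, if we let
$G := \sup \mathcal{R}_\alpha \cap [0,1[$, then we find for $s \ge 0$

\[ E(1-G)^s = E \exp(-s S_{\tau-}) = \frac{1}{\Phi(s)} = \frac{\Gamma(s+1-\alpha)}{\Gamma(1-\alpha)\,\Gamma(s+1)}. \]
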
This confirms the well known fact that the distribution of $1 - G$ is beta$(1-\alpha, \alpha)$. It may also be observed,
using properties of the local time process $(L_t,\ t \ge 0)$ associated with $\mathcal{R}_\alpha$, as discussed in [27], that the
exponential time $\tau$ can be represented as

\[ \tau = c_\alpha \int_0^1 (1-t)^{-\alpha}\,dL_t \]

for some constant $c_\alpha$ depending on the normalisation of the local time process. The fact that this local
time integral has an exponential distribution was derived by an analytic argument in [19, Corollary 3.4].

As discussed in [29], the length of the last interval component $]G,1[$ of the complement to $\mathcal{R}_\alpha \cap [0,1]$
is a size-biased pick from the collection of the interval lengths, and conditionally on $G$ the remaining
interval components are in symmetric order; moreover these properties are inherited by the compositions
of $n$ for every $n$. Corollary 8.2 in this case is new. It makes precise another sense in which, given the
partition of $n$ generated by $\mathcal{R}_\alpha \cap [0,1[$, the smaller blocks tend to come first in the composition of $n$.

8.4 Case $(\alpha, \alpha)$ for $0 < \alpha < 1$

Passing to the variable $z = -\log(1-x)$ we see from (42) that the associated regenerative subset of $[0,\infty]$
has zero drift and Lévy measure

\[ \nu(dz) = \alpha (1 - e^{-z})^{-\alpha-1} e^{-\alpha z}\,dz, \qquad z \in\ ]0,\infty[\,. \]

It can be read from [?] that such a regenerative set is generated as the zero set of the squared
Ornstein-Uhlenbeck process $(X_t)$ of dimension $2 - 2\alpha$ driven by the stochastic differential equation
$dX_t = 2\sqrt{X_t}\,dB_t + (2 - 2\alpha - X_t)\,dt$ where $(B_t)$ is a standard Brownian motion, and that the image
of this regenerative set via $x = 1 - e^{-z}$ is the zero set of a Bessel bridge of dimension $2 - 2\alpha$. In case
$\alpha = 1/2$ this is a Brownian bridge, as in Example 3. In the notation introduced in the discussion of the
previous case, this corresponds to conditioning $\mathcal{R}_\alpha \cap [0,1]$ on the event $1 \in \mathcal{R}_\alpha$. This can be rigorously
understood by first conditioning on $G \in [1-\epsilon, 1]$ and then taking a weak limit as $\epsilon \downarrow 0$.
The decrement matrix in this case has the special property that

\[ q(n:m) = \frac{f(m)\,r(n-m)}{r(n)} \tag{46} \]

where $f$ is given by (45) and $r(n) = [\alpha]_n/n!$ is the probability that a random walk on positive integers
with step distribution $f$ visits $n$. Equivalently, the composition probability function is

\[ p(n_1, \ldots, n_k) = \frac{\prod_{j=1}^{k} f(n_j)}{r(n)} \tag{47} \]

or more explicitly

\[ p(n_1, \ldots, n_k) = \frac{n!\,\alpha^k}{[\alpha]_n} \prod_{j=1}^{k} \frac{[1-\alpha]_{n_j-1}}{n_j!}. \tag{48} \]

It follows from a result of Kerov [23] that the decrement matrix of a non-degenerate regenerative composition
structure can be expressed in the form (46) for some functions $f$ and $r$ iff it is of the form (48)
for some $\alpha \in\ ]0,1[$. The same conclusion is also a consequence of Theorem 10.1 in the next section. The
conclusion of Corollary 8.2 in this case is that, given the partition of $[n]$, the block sizes appear in $C_n$ in
a uniform random order. This can be seen directly from the symmetry of formula (47) as a function of
$(n_1, \ldots, n_k)$.

8.5 Case $(\alpha, \theta)$ for $0 < \alpha < 1$, $\theta > 0$

It is known [?] that an $(\alpha,\theta)$ partition of $\mathbb{N}$ can be constructed as follows. First construct a
$(0,\theta)$ partition of $\mathbb{N}$, then shatter each class of this partition according to an independent $(\alpha,0)$ partition.
This operation restricts naturally to $[n]$ for each $n$, and can be interpreted in terms of a fragmentation
operation on the frequencies of classes. This result can be lifted to the level of regenerative composition
structures as follows.

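Theorem 8.3 For $0 < \alpha < 1$ and $\theta > 0$, let $Y_0 = 0$ and let $0 < Y_1 < Y_2 < \ldots$ be defined by the
independent stick-breaking scheme (14) for $X$ with beta$(1,\theta)$ distribution, let $\mathcal{R}_\alpha(i)$ for $i = 1, 2, \ldots$ be
a sequence of independent copies of the range $\mathcal{R}_\alpha$ of a stable subordinator, and define a random closed
subset $\tilde{\mathcal{R}}_{(\alpha,\theta)}$ of $[0,1]$ by

\[ \tilde{\mathcal{R}}_{(\alpha,\theta)} = \{1\} \cup \bigcup_{i=1}^{\infty} \bigl( [Y_{i-1}, Y_i] \cap [Y_{i-1} + \mathcal{R}_\alpha(i)] \bigr). \]

Then $\tilde{\mathcal{R}}_{(\alpha,\theta)}$ is a multiplicatively regenerative random subset of $[0,1]$, which can be represented as
$\tilde{\mathcal{R}}_{(\alpha,\theta)} = 1 - \exp(-\mathcal{R}_{(\alpha,\theta)})$ where $\mathcal{R}_{(\alpha,\theta)}$ is the range of a subordinator with Laplace exponent (41), and the
composition structure obtained by uniform random sampling from $\tilde{\mathcal{R}}_{(\alpha,\theta)}$ is regenerative with decrement matrix
(39).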

Proof. It is easily checked, using the multiplicative regeneration of the stick-breaking scheme and the
self-similarity of $\mathcal{R}_\alpha$, that $\tilde{\mathcal{R}}_{(\alpha,\theta)}$ is multiplicatively regenerative. The description of the Laplace exponent
then follows from Proposition 7.1 since the structural distribution is easily identified. □

The particular case $\alpha = \theta$ of Theorem 8.3 is largely contained in the work of Aldous and Pitman [?].
In particular, for $\alpha = \theta = 1/2$ this construction of the zero set of a Brownian bridge plays a key role in
the asymptotic theory of random mappings developed in [?] and [?].

9 The Green matrix 

For a given composition probability function $p$, the Green matrix is defined by the formula

\[ g(n,j) = \sum p(\lambda), \qquad 1 \le j \le n < \infty, \]

where the summation is over all compositions $\lambda = (n_1, \ldots, n_k) \models n$ which have the integer $j$ among the tail sums
$N_i = n - n_0 - n_1 - \cdots - n_{i-1}$ (where we set $n_0 = 0$). Recalling the interpretation of a regenerative composition
structure as a consistent family of Markov chains $Q_n$, $n = 1, 2, \ldots$, as in Section 3, $g(n,j)$ is the chance
that $Q_n$ with transition matrix $q$ and initial state $n$ ever visits state $j$. In particular, $g(n,n) = 1$.

Example. For the two-parameter family we have for $1 \le j < n$:

(i) for $(0,\theta)$

\[ g(n,j) = \frac{\theta}{j+\theta}, \]

as is well known;

(ii) for $(\alpha,0)$

\[ g(n,j) = \frac{[\alpha]_{n-j}}{(n-j)!}, \]

which by (44)-(45) is the probability that a particular random walk with negative increments started
at level $n$ ever visits state $j$;

(iii) for $(\alpha,\alpha)$

\[ g(n,j) = \binom{n}{j} \frac{[\alpha]_j}{(\alpha+n-j) \cdots (\alpha+n-1)}, \]

which is the probability of the same event for the random walk of the previous case conditioned to
hit 0.



Lemma 9.1 The Green matrix of a regenerative composition structure is the unique solution of the
recursion

\[ g(n,j) = \frac{j+1-q(j+1:1)}{n+1}\, g(n+1, j+1) + \frac{n+1-j}{n+1}\, g(n+1, j) \tag{49} \]

with boundary condition $g(n,n) = 1$.

Proof. The path of the chain $Q_n$, defining a composition of $n$, is obtained via random deletion of a state
from $1, 2, \ldots, n+1$, then restricting a path of $Q_{n+1}$ to the undeleted states and re-labeling the states by
ranking them from 1 to $n$. The event '$Q_n$ visits $j$' occurs when either $Q_{n+1}$ visits $j$ and one of the states
$j+1, \ldots, n+1$ is deleted (in which case state $j$ retains the label) or $Q_{n+1}$ visits $j+1$ and one of the
states $1, \ldots, j+1$ is deleted (if state $j+1$ is not deleted it changes the label to $j$). The first event has
probability $g(n+1,j)(n+1-j)/(n+1)$ and the second $g(n+1,j+1)(j+1)/(n+1)$. The events are not
disjoint and their intersection is the event '$Q_{n+1}$ visits both $j+1$ and $j$, and state $j+1$ is deleted', which
has probability $g(n+1,j+1)\,q(j+1:1)/(n+1)$. The uniqueness claim is obvious from the recursion. □



The next result gives an explicit formula for the Green matrix in terms of the representation (26) via
the Laplace exponent.

Theorem 9.2 The Green matrix of a regenerative composition structure is

\[ g(n,j) = \Phi(j) \binom{n}{j} \sum_{i=0}^{n-j} (-1)^i \binom{n-j}{i} \frac{1}{\Phi(j+i)}. \tag{50} \]




Proof. In view of

\[ q(j+1:1) = (j+1) \left( 1 - \frac{\Phi(j)}{\Phi(j+1)} \right) \]

the first factor on the right side of (49) equals $(j+1)\Phi(j)/((n+1)\Phi(j+1))$. Substituting this and (50)
into (49), and canceling the common factor $\binom{n}{j}\Phi(j)$, the to-be-checked recursion follows from the identity

\[ \Delta^{n-j+1} s(j) = \Delta^{n-j} s(j+1) - \Delta^{n-j} s(j) \]

where $\Delta$ is the forward difference operator $\Delta s(i) := s(i+1) - s(i)$ and $s$ is the sequence $s(i) = 1/\Phi(i)$
for $i \ge 1$. □
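
The explicit expression for the Green matrix, $g(n,j) = \Phi(j)\binom{n}{j}\sum_{i=0}^{n-j}(-1)^i\binom{n-j}{i}/\Phi(j+i)$, can be compared with a direct computation of the visiting probabilities of the chain $Q_n$. The Python sketch below does both; the function names are hypothetical, the decrement matrix is obtained from the expansion $\Phi(n:m) = \binom{n}{m}\sum_{j}(-1)^{j+1}\binom{m}{j}\Phi(n-m+j)$, and the example exponent $\Phi(n) = 2n/(n+1)$ is (41) with $(\alpha,\theta) = (0,1)$ rescaled so that $\Phi(1) = 1$.

```python
from fractions import Fraction
from math import comb


def decrement(phi, n, m):
    """q(n : m) = Phi(n : m) / Phi(n), with Phi(n : m) expanded via (25)."""
    entry = comb(n, m) * sum(
        (-1) ** (j + 1) * comb(m, j) * phi(n - m + j) for j in range(m + 1)
    )
    return entry / phi(n)


def green_by_formula(phi, n, j):
    """Closed-form expression for g(n, j) in terms of Phi."""
    return phi(j) * comb(n, j) * sum(
        (-1) ** i * comb(n - j, i) / phi(j + i) for i in range(n - j + 1)
    )


def green_by_chain(phi, n, j):
    """Probability that the chain started at n ever visits state j."""
    if n == j:
        return Fraction(1)
    return sum(
        decrement(phi, n, m) * green_by_chain(phi, n - m, j)
        for m in range(1, n - j + 1)
    )


def phi_example(s):
    """Illustrative exponent Phi(s) = 2s / (s + 1); note Phi(0) = 0."""
    return Fraction(2 * s, s + 1)


for n in range(1, 5):
    print([(green_by_formula(phi_example, n, j), green_by_chain(phi_example, n, j))
           for j in range(1, n + 1)])
```

The recursive chain computation is exponential in `n` and is meant only for small cases.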

We give one application of the formula. Let $L_n$ be the last part of $C_n$. In the event $\{L_n = j\}$ the chain
$Q_n$ visits state $j$ and then has the last positive decrement $j$. The distribution of the last part follows
from this observation and (50):

\[ P(L_n = j) = g(n,j)\,q(j:j) = \Phi(j:j) \binom{n}{j} \sum_{i=0}^{n-j} (-1)^i \binom{n-j}{i} \frac{1}{\Phi(j+i)}. \tag{51} \]

In particular, normalising by $\Phi(1) = 1$ for simplicity,

\[ P(L_n = 1) = n - n \sum_{k=2}^{n} \binom{n-1}{k-1} \frac{(-1)^k}{\Phi(k)}. \tag{52} \]




10 Symmetry 



Each composition structure $(C_n)$ has a dual $(\widehat{C}_n)$, where $\widehat{C}_n$ is the sequence of parts of $C_n$ in reverse order.
If $(C_n)$ is derived by uniform sampling from a random closed set $\tilde{\mathcal{R}} \subset [0,1]$, then $\widehat{C}_n$ is derived similarly
from $1 - \tilde{\mathcal{R}}$. If $(C_n)$ is regenerative, and so is $(\widehat{C}_n)$, then $(C_n)$ and $(\widehat{C}_n)$ must be identical in distribution,
by Corollary 7.3. Equivalently, $\tilde{\mathcal{R}} \stackrel{d}{=} 1 - \tilde{\mathcal{R}}$, in which case we call the composition structure reversible.
Two degenerate examples are provided by $\tilde{\mathcal{R}} = \{0\} \cup \{1\}$ and $\tilde{\mathcal{R}} = [0,1]$. The existence of regenerative
composition structures which are non-degenerate and reversible is quite surprising and counter-intuitive,
because the ideas of stick-breaking and multiplicative regeneration suggest that typical interval sizes
should decay in some sense from the left to the right. However, it is evident from the formula (47)
that for every $0 < \alpha < 1$ the regenerative composition structure associated with an $(\alpha,\alpha)$ partition is
reversible. Indeed, this composition structure is symmetric, meaning that the composition probability
function is a symmetric function of $(n_1, \ldots, n_k)$ with respect to all permutations of the arguments, for
each $k$. The equivalent condition on $\tilde{\mathcal{R}}$ is that the interval components of the complement of $\tilde{\mathcal{R}}$ form
an exchangeable interval partition of $[0,1]$, as defined in [2]. We note in passing that a large family of
symmetric composition structures was derived from the jumps of a subordinator in [32]. See also [16].

Theorem 10.1 Let $(C_n)$ be the regenerative composition structure derived by uniform sampling from
a random closed set $\tilde{\mathcal{R}} \subset [0,1]$. Let $F_n$ be the size of the first part of $C_n$, and $L_n$ be the size of the last
part of $C_n$. The following conditions are equivalent:

(i) $P(F_n = 1) = P(L_n = 1)$ for all $n$;

(ii) $F_n \stackrel{d}{=} L_n$ for all $n$;

(iii) $(C_n)$ is reversible;

(iv) $(C_n)$ is symmetric;

(v) $(C_n)$ is the regenerative composition structure with EPPF (48), associated with an $(\alpha,\alpha)$ partition
as in Section 8.4, for some $\alpha \in [0,1]$.

Before the proof of this result, we read from Theorem 5.2 and the discussion of Section 8.4 the following
restatement of the equivalence of conditions (iii) and (v):

Corollary 10.2 For a random closed subset $\tilde{\mathcal{R}}$ of $[0,1]$, the following two conditions are equivalent:

(i) $\tilde{\mathcal{R}}$ is multiplicatively regenerative and $\tilde{\mathcal{R}} \stackrel{d}{=} 1 - \tilde{\mathcal{R}}$.

(ii) $\tilde{\mathcal{R}}$ is distributed like the zero set of a standard Bessel bridge of dimension $2 - 2\alpha$, for some $\alpha \in [0,1]$.

Proof of Theorem 10.1. According to formula (26), for any regenerative composition structure

\[ P(F_n = 1) = q(n:1) = \frac{\Phi(n) - \Phi(n-1)}{\Phi(n)/n} \tag{53} \]

and the expressions (53) and (52) are obviously equal if $n = 1$ or $n = 2$. We know that the $(\alpha,\alpha)$
regenerative composition structure is symmetric, hence reversible. So for $\Phi_\alpha(n) := [1+\alpha]_{n-1}/(n-1)!$,
the identity $P(F_n = 1) = P(L_n = 1)$ together with (53) and (52) yields

\[ \frac{n\alpha}{n-1+\alpha} = n - n \sum_{k=2}^{n} \binom{n-1}{k-1} \frac{(-1)^k}{\Phi_\alpha(k)}. \tag{54} \]




Suppose now that a regenerative composition structure is such that $P(F_n = 1) = P(L_n = 1)$ for all
$n = 1, 2, \ldots$, and let us prove by induction that its Laplace exponent $\Phi$ normalised by $\Phi(1) = 1$ is such
that

\[ \Phi(s) = \Phi_\alpha(s) \tag{55} \]

for all $s = 1, 2, \ldots$, where $\alpha \in [0,1]$ is defined by (55) for $s = 2$, that is, $\Phi(2) - 1 = \alpha$. According to (53)
and (52), we have for all $n = 2, 3, \ldots$ that

\[ \frac{\Phi(n) - \Phi(n-1)}{\Phi(n)/n} = n - n \sum_{k=2}^{n} \binom{n-1}{k-1} \frac{(-1)^k}{\Phi(k)}, \tag{56} \]




so if we make the inductive hypothesis that (55) holds for all $s \le n - 1$ then we read from (54) and (56) that

$$\frac{\Phi(n) - \Phi(n-1)}{\Phi(n)/n} = \frac{\Phi_\alpha(n) - \Phi_\alpha(n-1)}{\Phi_\alpha(n)/n} + n(-1)^n \left( \frac{1}{\Phi_\alpha(n)} - \frac{1}{\Phi(n)} \right)$$

which, since $\Phi(n-1) = \Phi_\alpha(n-1)$ and $\Phi_\alpha(n-1)/\Phi_\alpha(n) = 1 - \alpha/(n-1+\alpha)$, yields the expression

$$\Phi(n) = \frac{\Phi_\alpha(n-1) - (-1)^n}{1 - \alpha/(n-1+\alpha) - (-1)^n/\Phi_\alpha(n)}.$$

But we know this formula holds for $\Phi(n) = \Phi_\alpha(n)$, so this must be the unique solution of the recursion, and the inductive step is established. Finally, the sequence $\Phi(1), \Phi(2), \ldots$ determines $\Phi(s)$ for all $s > 0$, by consideration of the second formula in (19), and the fact that a finite measure on $[0,1]$ is determined by its moments. □
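The equality of $P(F_n = 1)$ and $P(L_n = 1)$ for the exponent $\Phi_\alpha$ can be checked in exact arithmetic. The sketch below is only an illustration, not part of the proof. It assumes that $P(F_n = 1) = n(\Phi(n) - \Phi(n-1))/\Phi(n)$ as in (53), and that $P(L_n = 1) = n \sum_{k=1}^{n} \binom{n-1}{k-1} (-1)^{k-1}/\Phi(k)$, which is how we read the right-hand side of (56) when $\Phi(1) = 1$. The function names are ours, and the exponent $\Phi(n) = 2n/(n+1)$ is a hypothetical non-symmetric comparison case chosen for illustration.

```python
from fractions import Fraction
from math import comb


def phi_alpha(n, alpha):
    """Phi_alpha(n) = [1 + alpha]_{n-1} / (n-1)!, with the convention Phi(0) = 0."""
    if n == 0:
        return Fraction(0)
    value = Fraction(1)
    for i in range(1, n):
        value *= Fraction(alpha + i) / i
    return value


def phi_comparison(n):
    """Illustrative exponent 2n/(n+1) (assumption: nu(dx) = 2 exp(-x) dx, d = 0)."""
    return Fraction(2 * n, n + 1)


def prob_first_part_is_one(n, phi):
    """P(F_n = 1) = (Phi(n) - Phi(n-1)) / (Phi(n)/n)."""
    return n * (phi(n) - phi(n - 1)) / phi(n)


def prob_last_part_is_one(n, phi):
    """P(L_n = 1) as an alternating binomial sum in 1/Phi(k), assuming Phi(1) = 1."""
    return n * sum(
        comb(n - 1, k - 1) * Fraction((-1) ** (k - 1)) / phi(k)
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    alpha = Fraction(1, 2)
    for n in range(1, 7):
        first = prob_first_part_is_one(n, lambda m: phi_alpha(m, alpha))
        last = prob_last_part_is_one(n, lambda m: phi_alpha(m, alpha))
        print(n, first, last)
```

For the comparison exponent the two probabilities differ already at $n = 3$, in line with the theorem's claim that condition (i) singles out the $(\alpha, \alpha)$ family.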



11 Transition probabilities 

Transition probabilities describing the succession of random compositions $(C_n)$ or ordered partitions $(C_n^*)$
as n grows follow at once from the product formula © for the composition probability function. For 
ordered partitions of [n] these transition probabilities can be read immediately from as indicated in 
James [HI §5.4]. 

Assuming that $C_n^* = (A_1, \ldots, A_k)$, an ordered partition $C_{n+1}^*$ of $[n+1]$ is obtained either by inserting the singleton block $\{n+1\}$ into the sequence $A_1, \ldots, A_k$ or by adjoining the element $n+1$ to one of the blocks. It is easy to compute that $n+1$ is inserted before $A_1$ with probability

$$\frac{q(n+1:1)}{n+1}$$

or adjoined to $A_1$ with probability

$$\frac{n_1+1}{n+1}\,\frac{q(n+1:n_1+1)}{q(n:n_1)}.$$

Inductively, with probability

$$\prod_{i=1}^{j} \left( 1 - \frac{q(N_i+1:1)}{N_i+1} - \frac{n_i+1}{N_i+1}\,\frac{q(N_i+1:n_i+1)}{q(N_i:n_i)} \right)$$

$n+1$ is neither inserted immediately before nor adjoined to one of the blocks $A_1, \ldots, A_j$, and conditionally on this event (and given $(A_1, \ldots, A_k)$) this element is inserted as a singleton immediately following $A_j$ with probability

$$\frac{q(N_{j+1}+1:1)}{N_{j+1}+1}$$

or adjoined to $A_{j+1}$ (for $j < k$) with probability

$$\frac{n_{j+1}+1}{N_{j+1}+1}\,\frac{q(N_{j+1}+1:n_{j+1}+1)}{q(N_{j+1}:n_{j+1})}.$$

Here, the $n_i$ are the sizes of the $A_i$ and the $N_i$ are as in ©.

A transition law for integer compositions follows from the above. It is exactly the same as for the 
analogous ordered set partitions with the exception of the case when a composition of n is changed by 
appending a 1 to a series of unit parts like 1, 1, ..., 1, in which case the transition probability is obtained
by summation of individual probabilities of all possible singleton insertions into the series. 
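The insertion and adjoining rules for ordered partitions can be traced in a short exact computation. The sketch below is illustrative and rests on several assumptions that are ours: that $N_j = n_j + \cdots + n_k$; that $q(n:m) = \Phi(n:m)/\Phi(n)$ with $\Phi(n:m) = \binom{n}{m} \sum_{j=0}^{m} (-1)^{j+1} \binom{m}{j} \Phi(n-m+j)$ and $\Phi(0) = 0$; and that an ordered partition with block sizes $(n_1, \ldots, n_k)$ has probability $\prod_j q(N_j:n_j)/\binom{N_j}{n_j}$. The function names and the choice of the exponent $\Phi_\alpha$ with $\alpha = 1/2$ are hypothetical.

```python
from fractions import Fraction
from math import comb


def phi_alpha(n, alpha=Fraction(1, 2)):
    """Phi_alpha(n) = [1 + alpha]_{n-1} / (n-1)!, with Phi(0) = 0 (example exponent)."""
    if n == 0:
        return Fraction(0)
    value = Fraction(1)
    for i in range(1, n):
        value *= Fraction(alpha + i) / i
    return value


def q(n, m, phi=phi_alpha):
    """Assumed form q(n:m) = Phi(n:m) / Phi(n)."""
    phi_nm = comb(n, m) * sum(
        (-1) ** (j + 1) * comb(m, j) * phi(n - m + j) for j in range(m + 1)
    )
    return phi_nm / phi(n)


def transition_probabilities(sizes):
    """Probabilities of all moves of element n+1, scanning blocks left to right."""
    moves = {}
    passed = Fraction(1)  # probability of having skipped all earlier blocks
    for j, m in enumerate(sizes):
        tail = sum(sizes[j:])  # N_j, assumed to be the tail sum of block sizes
        insert = q(tail + 1, 1) / (tail + 1)
        adjoin = Fraction(m + 1, tail + 1) * q(tail + 1, m + 1) / q(tail, m)
        moves[("insert_before", j)] = passed * insert
        moves[("adjoin", j)] = passed * adjoin
        passed *= 1 - insert - adjoin
    moves[("insert_after_last", len(sizes))] = passed
    return moves


def ordered_partition_probability(sizes):
    """Assumed probability of an ordered partition with the given block sizes."""
    prob = Fraction(1)
    for j, m in enumerate(sizes):
        tail = sum(sizes[j:])
        prob *= q(tail, m) / comb(tail, m)
    return prob


if __name__ == "__main__":
    for move, prob in transition_probabilities((2, 1, 3)).items():
        print(move, prob)
```

Under these assumptions each transition probability equals the ratio of the probabilities of the new and the old ordered partition, which gives an independent check of the sequential description.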



12 Interval partitions 

The above probabilities of the two kinds of transition (insertion and joining) are equal to the expected 
sizes of intervals of a partition of $[0,1]$ induced by a uniform sample of $n$ points and $\mathcal{R}$. From this
viewpoint, a better prediction of the 'future' compositions arising when more points are added to the 
sample is obtained by conditioning on the actual sizes of intervals. 

At first we shall describe a somewhat simpler distribution of the interval sizes for the $[0,\infty]$-partition, which can be seen as discretisation of a subordinator in the spirit of [23, Sections 3 and 4]. For each $n$, a random set $\mathcal{R}$ and exponential order statistics $\epsilon_{1n}, \ldots, \epsilon_{nn}$ induce a partition of $[0,\infty]$ associated with the finite composition $C_n$. The partition comprises two kinds of parts: those that contain sample points and those that do not. The parts of the first kind are either open interval components of $\mathcal{R}^c$ which contain at least one of the $\epsilon_{jn}$'s, or one-point parts $\{\epsilon_{jn}\}$ corresponding to $\epsilon_{jn} \in \mathcal{R}$ and appearing with positive probability only for $d > 0$. The parts of the second kind are the connected components (intervals or separate points) of the set resulting from removing parts of the first kind. The parts of different kinds interlace, and if $C_n$ has $K_n$ classes there are $2K_n + 1$ pieces of the partition, say $J_{1n}, I_{1n}, \ldots, J_{K_n,n}, I_{K_n,n}, J_{K_n+1,n}$, which can be open or semiopen intervals or one-point sets. Let $G_{1n}, H_{1n}, \ldots, G_{K_n,n}, H_{K_n,n}, G_{K_n+1,n}$ be the sizes of the parts. With slight abuse of language we will call them 'intervals', with the understanding that some of them can degenerate into a point.

Theorem 12.1 The distribution of the random sequence $G_{1n}, H_{1n}, \ldots, G_{K_n,n}, H_{K_n,n}, G_{K_n+1,n}$ of interval sizes has the following properties:

(i) given the composition $C_n$ all interval sizes are conditionally independent,

(ii) $G_{1n}$ is independent of $C_n$ and also independent of the other interval sizes, and has Laplace transform

$$E \exp(-s G_{1n}) = \frac{\Phi(n)}{\Phi(n+s)}, \tag{57}$$

(iii) the unconditional distribution of $H_{1n}$ is given by

$$P(H_{1n} \in dz) = \frac{1 - e^{-nz}}{\Phi(n)}\,\nu(dz) + \frac{nd}{\Phi(n)}\,\delta_0(dz), \tag{58}$$

and given $C_n$ the analogous conditional distribution of $H_{1n}$ is

$$\frac{\binom{n}{m} (1 - e^{-z})^m e^{-(n-m)z}\,\nu(dz) + nd\,1(m = 1)\,\delta_0(dz)}{\Phi(n:m)}$$

where $m$ is the first part of $C_n$,

(iv) conditionally on the event that the first $j - 1$ parts of $C_n$ sum up to $m$, the truncated sequence $G_{jn}, H_{jn}, \ldots, H_{K_n,n}, G_{K_n+1,n}$ is independent of the variables $G_{1n}, H_{1n}, \ldots, G_{j-1,n}, H_{j-1,n}$ and of the first $j - 1$ parts of the composition $C_n$, and has the same distribution as the interlacing sequence

$$G_{1,n-m}, H_{1,n-m}, \ldots, H_{K_{n-m},n-m}, G_{K_{n-m}+1,n-m}$$

of interval sizes associated with the composition $C_{n-m}$ of the integer $n - m$.

Proof. The independence claims involved in (i) and (iv) follow from the memoryless property of the exponential distribution and the strong Markov property of $\mathcal{R}$ applied at the right endpoints of intervals $I_j$ or $J_j$. Formulas (57) and (58) follow from Lemma 1^31 and the second formula in (iii) follows by routine conditioning. □

Mapping $[0,\infty]$ to $[0,1]$ by $z \mapsto 1 - e^{-z}$ sends the partition of $[0,\infty]$ to a partition of the unit interval, say $\tilde{J}_{1n}, \tilde{I}_{1n}, \ldots, \tilde{I}_{K_n,n}, \tilde{J}_{K_n+1,n}$, which is the partition induced by a uniform sample and a multiplicatively regenerative set $\tilde{\mathcal{R}}$. The probability law of the partition of $[0,1]$ follows from Theorem 12.1. Thus, by virtue of the identity $E(1 - \tilde{G}_{1n})^s = E \exp(-s G_{1n})$ the Laplace transform (57) becomes a Mellin transform. Similarly, the ratio $\tilde{H}_{1n}/(1 - \tilde{G}_{1n})$ is independent of $\tilde{G}_{1n}$ and has distribution

$$P\left( \frac{\tilde{H}_{1n}}{1 - \tilde{G}_{1n}} \in dx \right) = \frac{1 - (1-x)^n}{\Phi(n)}\,\tilde{\nu}(dx) + \frac{nd}{\Phi(n)}\,\delta_0(dx).$$

The distribution of the remaining intervals follows recursively, by scaling with factor $(1 - \tilde{G}_{1n} - \tilde{H}_{1n})^{-1}$.

The sizes of these $2K_n + 1$ intervals, say $\tilde{G}_{jn}$ and $\tilde{H}_{jn}$, determine the law of the extended composition when adding new sample points. For example,

$$E \tilde{G}_{1n} = 1 - E(1 - \tilde{G}_{1n}) = 1 - \frac{\Phi(n)}{\Phi(n+1)} = \frac{\Phi(n+1:1)}{(n+1)\,\Phi(n+1)}$$

which by (26) is equal to $q(n+1:1)/(n+1)$, in accord with Section 11. The sizes also have a transparent frequency interpretation in terms of the infinite composition $C^*$. For example, $\tilde{G}_{1n}$ is the total frequency of the classes of $C^*$ strictly preceding the first class represented in $C_n^*$, and $\tilde{H}_{1n}$ is the frequency of the first class represented in $C_n^*$.

Tripartite decomposition of $[0,1]$. For $n = 1$ the partition consists of three intervals $\tilde{J}_{11}, \tilde{I}_{11}, \tilde{J}_{21}$ of sizes $G := \tilde{G}_{11}$, $H := \tilde{H}_{11}$, $D := \tilde{G}_{21}$. The variable $H$ is the frequency of the class of element 1 and its distribution is the structural distribution. Similarly, $G$ is the total frequency of classes strictly preceding the class of 1 in $C^*$, and $D$ is the total frequency of classes strictly following the class of 1.

Moments of $G$, $H$ and $D$ have clear interpretation in terms of finite compositions. Thus

$$E(1 - G)^{n-1} = \sum_{m=1}^{n} \frac{m}{n}\,q(n:m) = \frac{\Phi(1)}{\Phi(n)} \tag{59}$$

is the probability that element 1 is in the first block of $C_n^*$ or, what is the same, that a size-biased pick of a part from $C_n$ yields the first part. Similarly,

$$E D^{n-1} = \frac{1}{n}\,q(n:1) = \frac{\Phi(n:1)}{n\,\Phi(n)} \tag{60}$$

is the probability that $\{1\}$ is the first block of $C_n^*$.

Furthermore, the random variable $H$ can be written as a product of two independent variables $1 - G$ and $H/(1 - G)$, hence

$$E\left( \frac{H}{1 - G} \right)^{n-1} = \frac{E H^{n-1}}{E(1 - G)^{n-1}} = \frac{\Phi(n:n)}{\Phi(1)} \tag{61}$$

which is the conditional probability that the composition $C_n^*$ is trivial given 1 is in the first block.
For joint moments we have the formula

$$E\,G^i H^{j-1} D^k = \left( \sum_{a=0}^{i} (-1)^a \binom{i}{a} \frac{1}{\Phi(a+j+k)} \right) \left( \sum_{b=0}^{k} (-1)^b \binom{k}{b}\,\Phi(j+b:j+b) \right) \tag{62}$$

(the second sum may be further converted to variables $\Phi(1), \Phi(2), \ldots$) which follows from (59), (61) and $E H^{n-1} = p(n) = \Phi(n:n)/\Phi(n)$ by the binomial expansion of

$$G^i H^{j-1} D^k = (1 - (1 - G))^i\,(1 - G)^{j+k-1} \left( \frac{H}{1 - G} \right)^{j-1} \left( 1 - \frac{H}{1 - G} \right)^k.$$

The joint moments have the following interpretation. Let $(A_1, A_2, A_3)$ be an ordered partition of $[n]$, $n = i + j + k$, such that $1 \in A_2$ and the blocks are of sizes $i$, $j$ and $k$, respectively, with $i \ge 0$, $j \ge 1$ and $k \ge 0$. Then (62) is the probability that $A_2$ is a block of $C_n^*$ and $(A_1, A_2, A_3)$ is coarser than $C_n^*$. It follows that

$$\frac{(n-1)!}{i!\,(j-1)!\,k!}\;E\,G^i H^{j-1} D^k$$

is the probability that a size-biased pick of a part of $C_n$ is $j$, and this part is preceded by a composition of $i$ and followed by a composition of $k$ (with the obvious meaning when $i$ or $k$ is zero). For $k = 0$ this probability is equal to $(j/n) P(L_n = j)$ where $L_n$ is the last part of $C_n$; computing this yields an alternative proof for (51) and the formula for the Green matrix (50).
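The interpretation of the joint moments can be tested numerically in exact arithmetic. The sketch below is illustrative only and rests on our reading of the formulas: (62) is taken as the product of the two alternating sums $\sum_{a} (-1)^a \binom{i}{a}/\Phi(a+j+k)$ and $\sum_{b} (-1)^b \binom{k}{b} \Phi(j+b:j+b)$; $\Phi(n:m)$ is assumed to equal $\binom{n}{m} \sum_{l=0}^{m} (-1)^{l+1} \binom{m}{l} \Phi(n-m+l)$ with $\Phi(0) = 0$; and the number of ordered partitions $(A_1, A_2, A_3)$ with $1 \in A_2$ is taken as $(n-1)!/(i!\,(j-1)!\,k!)$. The function names and the example exponent $\Phi_\alpha$ with $\alpha = 1/3$ are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def phi_alpha(n, alpha=Fraction(1, 3)):
    """Phi_alpha(n) = [1 + alpha]_{n-1} / (n-1)!, with Phi(0) = 0 (example exponent)."""
    if n == 0:
        return Fraction(0)
    value = Fraction(1)
    for i in range(1, n):
        value *= Fraction(alpha + i) / i
    return value


def phi_nm(n, m, phi=phi_alpha):
    """Assumed form of Phi(n:m) as an alternating binomial sum in Phi."""
    return comb(n, m) * sum(
        (-1) ** (l + 1) * comb(m, l) * phi(n - m + l) for l in range(m + 1)
    )


def joint_moment(i, j, k, phi=phi_alpha):
    """E G^i H^(j-1) D^k, read as the product of two alternating sums."""
    first = sum((-1) ** a * comb(i, a) / phi(a + j + k) for a in range(i + 1))
    second = sum(
        (-1) ** b * comb(k, b) * phi_nm(j + b, j + b, phi) for b in range(k + 1)
    )
    return first * second


def size_biased_probability(i, j, k, phi=phi_alpha):
    """Probability that the size-biased part is j, preceded by i and followed by k."""
    n = i + j + k
    count = factorial(n - 1) // (factorial(i) * factorial(j - 1) * factorial(k))
    return count * joint_moment(i, j, k, phi)


def last_part_distribution(n, phi=phi_alpha):
    """P(L_n = j), obtained by peeling off first parts with q(n:m) = Phi(n:m)/Phi(n)."""
    dist = {j: Fraction(0) for j in range(1, n + 1)}
    for m in range(1, n + 1):
        q_nm = phi_nm(n, m, phi) / phi(n)
        if m == n:
            dist[n] += q_nm
        else:
            for j, p in last_part_distribution(n - m, phi).items():
                dist[j] += q_nm * p
    return dist


if __name__ == "__main__":
    n = 5
    law = last_part_distribution(n)
    for j in range(1, n + 1):
        print(j, size_biased_probability(n - j, j, 0), Fraction(j, n) * law[j])
```

Summing `size_biased_probability` over all $(i, j, k)$ with $i + j + k = n$ and $j \ge 1$ should give 1, since these events exhaust the possible positions of the part containing element 1.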

Acknowledgment Thanks to the referee for two careful readings of the paper, and for a number of
suggestions which helped to improve the exposition.